Abstract. We examine the computational complexity of context-free languages, mainly concentrating on
two well-known structural properties: immunity and pseudorandomness. An infinite language is
REG-immune (resp., CFL-immune) if it contains no infinite subset that is a regular (resp.,
context-free) language. We prove that (i) there is a context-free REG-immune language outside REG/n
and (ii) there is a REG-bi-immune language that can be computed deterministically using logarithmic
space. We also show that (iii) there is a CFL-simple set, where a CFL-simple language is an infinite
context-free language whose complement is CFL-immune. Similar to REG-immunity, a REG-primeimmune
language has no polynomially dense subsets that are also regular. We further prove that (iv) there
is a context-free language that is REG/n-bi-primeimmune but not even REG-immune. Concerning
pseudorandomness of context-free languages, we show that (v) CFL contains REG/n-pseudorandom
languages. Finally, we prove that (vi) against REG/n, there exists an almost 1-1 pseudorandom
generator computable by nondeterministic pushdown automata equipped with a write-only output tape
and (vii) against REG, there is no almost 1-1 weak pseudorandom generator computable
deterministically in linear time by a single-tape Turing machine.
1 Motivations and a Quick Overview

The context-free language is one of the most fundamental concepts in formal language theory. Besides its
theoretical interest, context-freeness has found, since the 1960s, practical applications in key fields
of computer science, including programming languages, compiler implementation, and markup languages,
mainly owing to unique traits of context-free grammars or phrase-structure grammars. Some of these
traits are highlighted by, for instance, pumping and swapping lemmas [6, 29], normal form theorems
[HI [T3], and undecidability theorems [SI [T^], all of which reveal certain substructures of context-free
languages. The literature of over half a century has successfully explored numerous fundamental properties
(including operational closure, normal forms, and minimization) of the family CFL of context-free
languages. The family CFL contains a number of non-regular languages, such as
$L_{eq} = \{0^n1^n \mid n \geq 0\}$ and $Equal = \{w \in \{0,1\}^* \mid \#_0(w) = \#_1(w)\}$, where
$\#_b(w)$ denotes the number of $b$'s in $w$. An effective use of a pumping lemma, for instance, easily
separates them from the family REG of regular languages (see, e.g., [TB] for their proofs). Nonetheless,
these two context-free languages look quite different in nature and in complexity. Is there any extremely
"complex" context-free language? Since time complexity might not be a suitable complexity measure for
context-free languages, another way to measure their complexity is to show "structural" differences among
those languages.

Numerous structural properties have been proposed for polynomial-time complexity classes, including 
P (deterministic polynomial-time class) and NP (nondeterministic polynomial-time class), and have been 
studied to understand their characteristics. Many of those properties, which are important in their own
right, have arisen naturally in the context of answering long-unsettled questions, such as the P =?NP question
(see, e.g., [?] for these properties). To scale the complexity of context-free languages, we wish to target 
two well-known structural properties — immunity and pseudorandomness — which have been studied since the 
1940s in computational complexity theory and computational cryptography. These two properties are known 
to be closely related. In this paper, we shall spotlight them within a framework of formal language theory. 
Our approach may differ from standard ones in a setting of polynomial-time bounded computation. 
In the first part of this paper (beginning with Section 3), our special attention goes to languages that
have only computationally "hard" non-trivial subsets. Those languages, known as immune languages and simple
languages, naturally possess high complexity. Given a fixed family C of languages, an infinite language is
called C-immune if it has no infinite subset in C, and a C-simple language is an infinite language in C whose
complement is C-immune. Significantly, C-immunity satisfies a self-exclusion property: C cannot be
C-immune. Notice that the notion of simplicity has played a key role in the theory of NP-completeness (see,
e.g., [15]). In addition, a language is C-bi-immune if both the language and its complement are C-immune.

These notions of immunity and simplicity date back to the 1940s, when they were first conceived by
Post [24] for recursively enumerable languages. Their resource-bounded analogues were discussed later in the
1970s by Flajolet and Steyaert [11]. During the 1980s, Ko and Moore [H] intensively studied such limited
immunity, whereas Homer and Maass [16] explored resource-bounded simplicity. The notion of bi-immunity
was introduced in the mid-1980s by Balcázar and Schöning [5]. Since then, numerous variants of immunity and
simplicity (for instance, strong immunity, almost immunity, balanced immunity, and hyperimmunity) have
been proposed and studied extensively (see, e.g., for references therein).

Despite the past efforts, in a setting of polynomial-time bounded computation, the notion of immunity has
eluded our full understanding; for instance, it has been open whether there exists a P-immune set in
NP or even an NP-simple set, since the existence of such a set immediately yields a class separation between
NP and co-NP. While there is a large volume of work on the immunity of polynomial-time complexity classes,
there has been little study of the immunity of context-free languages in the past literature. An analysis
of REG-immunity inside CFL could shed new light on structural differences among various context-free
languages. For instance, the aforementioned context-free language $L_{eq}$ is REG-immune [11], whereas its
companion language Equal is not REG-immune. Context-freeness provides tremendous advantages
in proving immunity and non-immunity over polynomial-time complexity classes. Unlike NP-simplicity, we
can demonstrate that CFL-simple languages actually exist.

There are, however, unsettled questions concerning REG-immunity in CFL. One of those questions
is related to REG-bi-immunity. It is unclear whether REG-bi-immune languages actually exist inside CFL. At
best, we can prove that the language class L (deterministic logarithmic-space class) contains
REG-bi-immune languages. Another unsolved question concerns the density of immune languages. Notice
that all known REG-immune languages $L$ in CFL have exponentially small density rate
$|L \cap \Sigma^n|/|\Sigma^n|$. The REG-immune language $L_{eq}$, for instance, has density rate
$|L_{eq} \cap \{0,1\}^n|/2^n \leq 1/2^n$ for any even length $n$; in contrast, Equal, which is not even
REG-immune, has density rate $|Equal \cap \{0,1\}^n|/2^n > 1/n$ for any sufficiently large even number $n$.
Naturally, we can ask if there exists any context-free REG-immune language whose density rate
$|L \cap \Sigma^n|/|\Sigma^n|$ is lower-bounded by a "polynomial" fraction, i.e., $1/p(n)$ for a certain
non-zero polynomial $p$. Such a condition is referred to as polynomially dense or p-dense. In this paper, as
a first step toward this open question, we shall show that there exists a p-dense REG-immune language in
L.

Our C-immunity requires the non-existence of an infinite subset in C. Is there any language that lacks
only p-dense subsets in C? Such a natural question gives rise to a variant of C-immunity, referred to as
C-primeimmunity. We turn our attention to this natural notion inside CFL. As an example, we shall prove that
an "extended" language of Equal, $Equal_* = \{aw \mid a \in \{\lambda, 0, 1\}, w \in Equal\}$, is
REG/n-primeimmune, where REG/n is obtained from REG by supplementing appropriate "advice" of size $n$ [26].
In stark contrast with REG-bi-immunity, we shall prove that REG-bi-primeimmune languages (even
REG/n-bi-primeimmune languages) exist inside CFL.

The second part of this paper (beginning with Section 7) is exclusively devoted to a property of
computational randomness or pseudorandomness. An early computational approach to "randomness" began in the
1940s. Church's [10] random 0-1 sequences, for instance, demand that every infinite subsequence contains
asymptotically the same number of 0s and 1s. This line of study on computational randomness, also known as
stochasticity, concerns the asymptotic behavior of random sequences. A connection between stochasticity
and bi-immunity is known.

To suit our study of context-free languages, we rather examine non-asymptotic behaviors of randomness
inside languages. This paper discusses the following type of "random" languages. We say that a language $L$
is C-pseudorandom if, for every language $A$ in C, the characteristic function $\chi_A$ agrees with $\chi_L$
on "nearly" 50% of strings of each length, where "nearly" means "with a negligible margin of error." Our
notion can be seen as a variant of Wilber's [27] randomness, which dictates an asymptotic behavior of
$\chi_L$ and $\chi_A$.

As in the case of primeimmunity, p-denseness requires our special attention. Targeting p-dense
languages, we introduce another "randomness" notion, called weak C-pseudorandomness, as a non-asymptotic
variant of Müller's [23] balanced immunity, Loveland's [3T] unbiasedness, or the weak stochasticity of
Ambos-Spies et al. [2]. Loosely speaking, a language $L$ is weak C-pseudorandom if the density rate
$|L \cap A \cap \Sigma^n|/|A \cap \Sigma^n|$ is "nearly" a half for every p-dense language $A$ in C.

A typical example of a REG/n-pseudorandom language is $IP_*$, whose strings are of the form $auv$ with
$a \in \{\lambda, 0, 1\}$ and $|u| = |v|$ such that the binary inner product between $u^R$ and $v$ is odd.
We show a close connection between pseudorandomness and primeimmunity. From this connection, we can
conclude that $IP_*$ is also REG/n-bi-primeimmune. The aforementioned language $Equal_*$ can separate the
notion of weak REG-pseudorandomness from the notion of REG-primeimmunity.

In the early 1980s, Blum and Micali [7] studied pseudorandom generators, which produce unpredictable
sequences. Our formulation of pseudorandom generators, attributed to Yao [30], uses indistinguishability
from uniform sequences. Loosely speaking, a pseudorandom generator is a function producing a string that
looks random to any target adversary (in this case, we say that the generator fools it). In our language
setting, we call a function $G$ from $\Sigma^*$ to $\Sigma^*$ with stretch factor $s(n)$ (that is,
$|G(x)| = s(|x|)$) a pseudorandom generator against a language family C if $G$ fools every language $A$
in C. Our pseudorandom generator tries to fool languages in the sense that, over input strings of each
length $n$, the outcome distribution of the generator is indistinguishable from the uniform distribution
over the strings of length $s(n)$; that is, the function
$\varepsilon(n) = |\mathrm{Prob}_x[\chi_A(G(x)) = 1] - \mathrm{Prob}_y[\chi_A(y) = 1]|$ is negligible,
where $x$ and $y$ are chosen uniformly at random from $\Sigma^n$ and $\Sigma^{s(n)}$, respectively. We
shall prove that, against REG/n, there exists an almost 1-1 pseudorandom generator computable by a
nondeterministic pushdown automaton equipped with an output tape. As a limitation of generators, we can
show that, even against REG, there is no almost 1-1 pseudorandom generator computable by a one-tape
one-head linear-time deterministic Turing machine.

2 Foundations 

The natural numbers are nonnegative integers and we write $\mathbb{N}$ to denote the set of all natural
numbers. We set $\mathbb{N}^+ = \mathbb{N} - \{0\}$ for convenience. For any two integers $m, n$ with
$m \leq n$, the notation $[m,n]_{\mathbb{Z}}$ stands for the integer interval $\{m, m+1, m+2, \ldots, n\}$.
The symmetric difference between two sets $A$ and $B$, denoted $A \triangle B$, is the set
$(A - B) \cup (B - A)$. In this paper, all logarithms are assumed to have base two unless otherwise stated.
Let $\log^{(1)} n = \log n$ and $\log^{(i+1)} n = \log(\log^{(i)} n)$ for each number $i \in \mathbb{N}^+$.
A function $\mu$ from $\mathbb{N}$ to $\mathbb{R}^{\geq 0}$ (all nonnegative reals) is called noticeable if
there exists a non-zero polynomial $p$ such that $\mu(n) \geq 1/p(n)$ for all but finitely many numbers $n$
in $\mathbb{N}$. By contrast, $\mu$ is called negligible if we have $\mu(n) \leq 1/p(n)$ for any non-zero
polynomial $p$ and for all sufficiently large numbers $n \in \mathbb{N}$.

Our alphabet, often denoted $\Sigma$, is always a nonempty finite set. A string is a series of symbols
taken from $\Sigma$, and the length of a string $x$ is the number of symbols in $x$ and is denoted $|x|$.
For simplicity, the empty string is always denoted $\lambda$. For two strings $x$ and $y$, $xy$ denotes the
concatenation of $x$ and $y$. In particular, $\lambda x$ coincides with $x$. The notation $\Sigma^n$ denotes
the set of all strings of length $n$. For any string $x$ of length $n$ and for any index
$i \in [0,n]_{\mathbb{Z}}$, $pref_i(x)$ is the substring of $x$ made up of the first $i$ symbols of $x$.
For each string $w \in \Sigma^*$ and any symbol $a \in \Sigma$, the number of $a$'s appearing in $w$ is
represented by $\#_a(w)$. A language over an alphabet $\Sigma$ is a subset of $\Sigma^*$, and the
characteristic function $\chi_A$ of $A$ is defined as $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$
otherwise for every string $x \in \Sigma^*$.

For any language $L$ over $\Sigma$, the complement $\Sigma^* - L$ of $L$ is often denoted $\overline{L}$
whenever $\Sigma$ is clear from the context. Furthermore, the complement of a family C of languages is the
collection of all languages whose complements are in C. We use the conventional notation co-C to denote the
complement of C. The notation $dense(L)(n)$ expresses the cardinality of the set $L \cap \Sigma^n$; that is,
$dense(L)(n) = |L \cap \Sigma^n|$. A language $L$ over $\Sigma$ is called tally if $L \subseteq \{a\}^*$
for a certain fixed symbol $a \in \Sigma$.
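
To make the quantity $dense(L)(n)$ concrete, the following minimal Python sketch counts
$|L \cap \Sigma^n|$ by brute-force enumeration for the two languages $\{0^n1^n \mid n \in \mathbb{N}\}$
and Equal. The function names (dense, in_leq, in_equal) are ours and serve only as an illustration;
exhaustive enumeration is feasible only for small $n$.

```python
from itertools import product

def dense(member, alphabet: str, n: int) -> int:
    """Return dense(L)(n) = |L ∩ Σ^n| by enumerating all strings of length n."""
    return sum(1 for t in product(alphabet, repeat=n) if member("".join(t)))

def in_leq(w: str) -> bool:
    """Membership in {0^n 1^n | n in N}."""
    half = len(w) // 2
    return w == "0" * half + "1" * half

def in_equal(w: str) -> bool:
    """Membership in Equal: the numbers of 0s and 1s coincide."""
    return w.count("0") == w.count("1")

# Print n, dense(L_eq)(n), dense(Equal)(n) for a few even lengths.
for n in (2, 4, 6, 8):
    print(n, dense(in_leq, "01", n), dense(in_equal, "01", n))
```

For even $n$ the first count is always 1, whereas the second equals the central binomial coefficient
$\binom{n}{n/2}$.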

This paper mainly discusses regular languages and context-free languages. We assume the reader's basic 
knowledge on fundamental mechanisms of one-tape one-head one-way finite automata, possibly equipped with 
pushdown (or first-in last-out) stacks. See, e.g., [T71 (TH] for the formal definitions of the aforementioned 
finite automata. Generally speaking, for a finite automaton M, the notation L{M) represents the set of 
all strings "accepted" by M under appropriate accepting criteria. Such criteria may significantly differ if 
we choose different machine types. Conventionally, we say that M recognizes a language L if L = L{M). 
Languages recognized by deterministic finite automata (or dfa's) and nondeterministic pushdown automata 
(or npda's) are respectively called regular languages and context-free languages. In addition, deterministic 
pushdown automata (or dpda's) recognize only deterministic context-free languages. For ease of notation, 
we denote by REG the family of regular languages and by CFL the family of context-free languages. As a 
proper subclass of CFL, DCFL denotes the family of all deterministic context-free languages. 

It is known that the language family CFL is not closed under conjunction (i.e., intersection; see, e.g.,
[18]). This fact inspires us to introduce a restricted conjunctive closure of CFL. For any positive integer
$k$, the $k$-conjunctive closure of CFL, denoted CFL($k$), is the collection of all languages $L$ such that
there are $k$ languages $L_1, L_2, \ldots, L_k$ in CFL for which $L = L_1 \cap L_2 \cap \cdots \cap L_k$.
In particular, CFL(1) coincides with CFL itself.
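
As a small sanity check of this definition, the sketch below tests on short strings the standard
decomposition of $\{a^nb^nc^n \mid n \in \mathbb{N}\}$ as the intersection of the two context-free
languages $\{a^nb^nc^m\}$ and $\{a^mb^nc^n\}$, which places it in CFL(2). The helper names are
hypothetical, and membership is decided directly rather than by simulating npda's.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(a*)(b*)(c*)")

def exponents(w: str):
    """Return (i, j, k) if w = a^i b^j c^k, and None otherwise."""
    match = BLOCKS.fullmatch(w)
    return tuple(len(g) for g in match.groups()) if match else None

def in_first(w: str) -> bool:
    """Membership in {a^n b^n c^m}, a context-free language."""
    e = exponents(w)
    return e is not None and e[0] == e[1]

def in_second(w: str) -> bool:
    """Membership in {a^m b^n c^n}, a context-free language."""
    e = exponents(w)
    return e is not None and e[1] == e[2]

def in_3eq(w: str) -> bool:
    """Membership in {a^n b^n c^n}."""
    e = exponents(w)
    return e is not None and e[0] == e[1] == e[2]

def agree_up_to(max_len: int) -> bool:
    """Check that the intersection equals {a^n b^n c^n} on all strings up to max_len."""
    return all(
        (in_first(w) and in_second(w)) == in_3eq(w)
        for n in range(max_len + 1)
        for w in map("".join, product("abc", repeat=n))
    )

print(agree_up_to(7))
```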

We describe technical but useful tools — two pumping lemmas and one swapping lemma — for regular 
languages as well as context-free languages. See [SJ [TBI HI] for their proofs. 

Lemma 2.1 1. [pumping lemma for regular languages] Let $L$ be any infinite regular language. There
exists a number $m > 0$ (referred to as a pumping-lemma constant) such that, for any string $w$ of length
$\geq m$ in $L$, there is a decomposition $w = xyz$ for which (i) $|xy| \leq m$, (ii) $|y| \geq 1$, and
(iii) $xy^iz \in L$ for any $i \in \mathbb{N}$.

2. [pumping lemma for context-free languages] Let $L$ be any infinite context-free language. There exists
a positive number $m$ such that, for any $w \in L$ with $|w| \geq m$, $w$ can be decomposed as $w = uvxyz$
with the following three conditions: (i) $|vxy| \leq m$, (ii) $|vy| \geq 1$, and (iii) $uv^ixy^iz$ is in
$L$ for any $i \in \mathbb{N}$.

3. [swapping lemma for regular languages] Let $L$ be any infinite regular language on alphabet $\Sigma$
with $|\Sigma| \geq 2$. There exists a positive integer $m$ (called a swapping-lemma constant) such that,
for any integer $n \geq 1$ and any subset $S$ of $L \cap \Sigma^n$ of cardinality at least $m$, the
following condition holds: for any integer $i \in [0,n]_{\mathbb{Z}}$, there exist two strings
$x = x_1x_2$ and $y = y_1y_2$ in $S$ with $|x_1| = |y_1| = i$ and $|x_2| = |y_2|$ satisfying that
(i) $x \neq y$, (ii) $y_1x_2 \in L$, and (iii) $x_1y_2 \in L$.
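
The pumping lemma for regular languages is typically used in contrapositive form. The sketch below, with
hypothetical names, searches all decompositions $w = xyz$ with $|xy| \leq m$ and $|y| \geq 1$ and tests
$xy^iz$ only for $i \in \{0,1,2,3\}$, a finite stand-in for "any $i \in \mathbb{N}$". A negative answer
therefore refutes pumpability of $w$, while a positive answer is only suggestive.

```python
def in_leq(w: str) -> bool:
    """Membership in {0^n 1^n | n in N}."""
    half = len(w) // 2
    return w == "0" * half + "1" * half

def pumpable(w: str, m: int, member, max_i: int = 3) -> bool:
    """True if some split w = xyz with |xy| <= m and |y| >= 1 keeps
    x y^i z in the language for every i in 0..max_i."""
    for end in range(1, min(m, len(w)) + 1):   # end = |xy|
        for start in range(end):               # start = |x|
            x, y, z = w[:start], w[start:end], w[end:]
            if all(member(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

# For each candidate constant m, the string 0^m 1^m admits no valid decomposition.
for m in range(1, 8):
    print(m, pumpable("0" * m + "1" * m, m, in_leq))
```

Every printed answer is False, since any admissible $y$ consists of 0s only and removing it unbalances the
string; this is the usual argument that $\{0^n1^n\}$ is not regular.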

To explain the notion of advice, we first adapt a "track" notation $[{}^{x}_{y}]$ from [26]. For any pair
of symbols $\sigma \in \Sigma_1$ and $\tau \in \Sigma_2$, the notation $[{}^{\sigma}_{\tau}]$ denotes a new
symbol made from $\sigma$ and $\tau$. For two strings $x = x_1x_2\cdots x_n$ and $y = y_1y_2\cdots y_n$ of
the same length $n$, the notation $[{}^{x}_{y}]$ is shorthand for the string
$[{}^{x_1}_{y_1}][{}^{x_2}_{y_2}]\cdots[{}^{x_n}_{y_n}]$. An advice function is a map $h$ from $\mathbb{N}$
to $\Gamma^*$, where $\Gamma$ is an appropriate alphabet. For any family C of languages, the advised class
C/n denotes the collection of languages $L$ over an alphabet $\Sigma$ for which there exist another
alphabet $\Gamma$, an advice function $h : \mathbb{N} \to \Gamma^*$, and a language $A \in$ C such that,
for every string $x \in \Sigma^*$, (i) $|h(|x|)| = |x|$ (i.e., length preserving) and (ii) $x \in L$ iff
$[{}^{x}_{h(|x|)}] \in A$ [26, 29].
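
The definition can be illustrated on the language $\{0^n1^n \mid n \in \mathbb{N}\}$. In the sketch below,
the advice $h(n)$ is the unique candidate string of length $n$ (or a dummy string of #'s for odd $n$), and
the underlying regular language merely checks that the two tracks coincide. The choice of advice and all
names are assumptions made for this illustration.

```python
from itertools import product

def h(n: int) -> str:
    """Illustrative advice string of length n."""
    if n % 2 == 0:
        return "0" * (n // 2) + "1" * (n // 2)
    return "#" * n  # can never coincide with a binary input

def track(x: str, y: str):
    """The track string [x / y] as a list of (upper, lower) symbol pairs."""
    if len(x) != len(y):
        raise ValueError("tracks must have equal length")
    return list(zip(x, y))

def dfa_accepts(symbols) -> bool:
    """A two-state dfa over track symbols: it falls into a trap state
    as soon as the upper and lower symbols differ."""
    state = "same"
    for upper, lower in symbols:
        if upper != lower:
            state = "differ"
    return state == "same"

def in_leq_with_advice(x: str) -> bool:
    return dfa_accepts(track(x, h(len(x))))

def in_leq(x: str) -> bool:
    half = len(x) // 2
    return x == "0" * half + "1" * half

# The advised dfa agrees with direct membership on all binary strings up to length 8.
print(all(in_leq_with_advice(w) == in_leq(w)
          for n in range(9) for w in map("".join, product("01", repeat=n))))
```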

As an additional computation model, we introduce the notion of one-tape one-head off-line Turing
machines whose tape heads move in all directions. All tape cells of an infinite input/work tape are indexed
with integers, and an input string of length $n$ is given in the cells indexed between 1 and $n$, surrounded
by two designated endmarkers. We borrow the notation 1-DTIME($t(n)$) from the literature to denote the
collection of all languages that are recognized within time $t(n)$ by those machines. As a special case,
write 1-DLIN for 1-DTIME($O(n)$). It is well-known that REG = 1-DLIN = 1-DTIME($o(n \log n)$).

To handle (multi-valued partial) functions, we further consider Turing machines that produce output
strings. Conventionally, whenever a single-tape machine halts with a tape that contains only a block
of non-blank symbols beginning at the left endmarker and surrounded only by blanks, we treat the string
given in this block as an outcome of the machine. A (partial) function $f$ from $\Sigma^*$ to $\Gamma^*$,
where $\Sigma$ and $\Gamma$ are alphabets, is called length preserving if $|f(x)| = |x|$ for any string $x$
in the domain of $f$.

Let us introduce several function classes, which are natural extensions of our language families REG and
CFL. The function class 1-FLIN is the set of all single-valued total functions computable in time $O(n)$ by
one-tape one-head off-line deterministic Turing machines whose tape heads move in all directions. Similarly,
the notation 1-FLIN(partial) expresses the set of all single-valued partial functions $f$ such that there
exists a one-tape one-head off-line deterministic Turing machine $M$ that starts with input $x$ and halts
with output $f(x)$ by entering an accepting state whenever $f(x)$ is defined, and enters a rejecting state
whenever $f(x)$ is not defined.

In a similar fashion, we define 1-NLINMV as the class of all multi-valued partial functions $f$ for which
there exists a one-tape one-head off-line nondeterministic Turing machine $M$, provided that all computation
paths (both accepting and rejecting) terminate with output values in time $O(n)$, with the condition that
$f(x)$ consists of all output values produced along accepting paths. Notice that, when $f(x) = \emptyset$,
there should be no accepting path. See [55] for their fundamental properties.

The original npda model was introduced to recognize languages. Let us expand this model to compute
(partial) functions. For this purpose, we equip an npda with an additional output tape and its associated
tape head. Now, our npda has two tapes: a read-only input tape and a write-only output tape. This new
npda acts as a standard npda with a single stack except for moves of an output-tape head. On the write-only
output tape, the tape head always moves to the right whenever it writes a symbol in a tape cell. We also
allow each tape head to stay still while it scans a blank symbol, provided that it does not write any
non-blank symbol. Since the head moves only to a new blank cell, it cannot read any symbol that has already
been written on the output tape. Along each computation path, we define an output as follows. When the npda
enters an accepting state, we treat the string produced on the output tape as an output of the machine. On
the contrary, when the machine enters a rejecting state, we assume that the machine produces no output
along this path. Hence, the machine can produce more than one output value or no output value; that is,
such an npda in general computes a multi-valued partial function. Let CFLMV denote the collection of all
multi-valued partial functions that can be produced by certain npda's. Let CFLSV consist of all
single-valued partial functions in CFLMV. When all functions $f$ are limited to be total (i.e., $f(x)$ is
always defined), we use the notation CFLSV$_t$. Note that, for every language $L$, $L \in$ CFL iff
$\chi_L \in$ CFLSV$_t$.

3 Resource-Bounded Immunity and Simplicity 

Intuitively, an immune language contains only finite subsets and infinite subsets that are "hard" to compute; 
in other words, it lacks any non-trivial "easy" subset. In contrast, a simple language inherits the immunity 
only for its complement. Such languages turn out to possess quite high complexity. The original notions of 
immunity and simplicity are rooted in the 1940s and later adapted to computational complexity theory in 
the 1970s with various restrictions on their computational resources. 

The notion of resource-bounded immunity for an arbitrary family C of languages can be introduced in
the following abstract way. A language $L$ is said to be C-immune if (i) $L$ is infinite and (ii) no
infinite subset of $L$ exists in C. When a language family D contains a C-immune language, we briefly say
that D is C-immune. Since C cannot be C-immune, if D is C-immune then it immediately follows that
$D \nsubseteq C$. On the contrary, the separation $D \nsubseteq C$ cannot, in general, guarantee the
existence of C-immune languages inside D. For this reason, a separation between two language families by
immune languages is sometimes referred to as a strong separation. In a polynomial-time setting, for
instance, even assuming that P $\neq$ NP, it is not known whether there is a P-immune language in NP or,
equivalently, whether NP is P-immune.

Within a framework of formal language theory, we extensively discuss the immunity of two well-known
families of languages: REG and CFL. Earlier, Flajolet and Steyaert presented two apparent examples: a
REG-immune language $L_{eq} = \{0^n1^n \mid n \in \mathbb{N}\}$ and a CFL-immune language
$L_{3eq} = \{a^nb^nc^n \mid n \in \mathbb{N}\}$. Notice that, in contrast, similar non-regular languages
$Equal = \{x \in \{0,1\}^* \mid \#_0(x) = \#_1(x)\}$ and
$3Equal = \{x \in \{0,1,2\}^* \mid \#_0(x) = \#_1(x) = \#_2(x)\}$ are not REG-immune, because two regular
languages $\{(01)^n \mid n \in \mathbb{N}\}$ and $\{(012)^n \mid n \in \mathbb{N}\}$ are respectively
infinite subsets of Equal and of 3Equal. This clear contrast signifies a structural difference among those
languages.
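
The two regular witnesses named above can be checked mechanically. The short sketch below, whose
identifiers are ours, confirms for small $n$ that $(01)^n$ matches the regular expression `(01)*` and lies
in Equal, and likewise that $(012)^n$ matches `(012)*` and lies in 3Equal.

```python
import re

def in_equal(w: str) -> bool:
    """Membership in Equal over {0,1}."""
    return w.count("0") == w.count("1")

def in_3equal(w: str) -> bool:
    """Membership in 3Equal over {0,1,2}."""
    return w.count("0") == w.count("1") == w.count("2")

ZERO_ONE = re.compile(r"(01)*")
ZERO_ONE_TWO = re.compile(r"(012)*")

def witnesses(limit: int):
    """Yield the pairs ((01)^n, (012)^n) for n < limit."""
    for n in range(limit):
        yield "01" * n, "012" * n

print(all(
    ZERO_ONE.fullmatch(u) is not None and in_equal(u)
    and ZERO_ONE_TWO.fullmatch(v) is not None and in_3equal(v)
    for u, v in witnesses(50)
))
```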

Unlike the well-studied P-immunity, we can take quite different approaches toward REG-immune and 
CFL-immune languages because these immunity notions embody unique characteristics. For example, in 
many cases, as we shall see later, a diagonalization technique — a standard technique of constructing immune 
languages in a polynomial setting — is no longer necessary. 

Since REG $\subseteq$ CFL, CFL-immunity clearly implies REG-immunity but the converse does not hold
because, for instance, $L_{eq}$ is REG-immune and also belongs to CFL. Since $L_{eq}$ and $L_{3eq}$ contain
at most one string of each length (e.g., $dense(L_{eq})(n) = 1$ for all even lengths $n \in \mathbb{N}$),
they belong to the advised class REG/n. In particular, since $L_{eq}$ is in DCFL, we can conclude that
DCFL $\cap$ REG/n is REG-immune. Similarly, since $L_{3eq} \in$ CFL(2), the language family
CFL(2) $\cap$ REG/n is CFL-immune.

As the first nontrivial example of REG-immunity, we want to show that the language family DCFL $-$
REG/n is REG-immune, complementing the aforementioned REG-immunity of DCFL $\cap$ REG/n.

Proposition 3.1 The language family DCFL $-$ REG/n is REG-immune.

Our REG-immune language is a "marked" language $Pal_\# = \{w\#w^R \mid w \in \{0,1\}^*\}$ over the ternary
alphabet $\{0,1,\#\}$, where # is used as a separator. Notice that the use of this separator is crucial
because the corresponding unmarked version $Pal = \{ww^R \mid w \in \{0,1\}^*\}$ (even-length palindromes)
is no longer REG-immune; for instance, $\{0^{2n} \mid n \in \mathbb{N}\}$ is an infinite regular subset of
Pal. Although the proof of Proposition 3.1 is relatively easy, we include it for completeness.

Proof of Proposition 3.1 We shall show that $Pal_\#$ is REG-immune and is also located outside of
REG/n. First, we shall prove the REG-immunity of $Pal_\#$. Assume on the contrary that $Pal_\#$ has an
infinite regular subset $A$. Let $m$ be a pumping-lemma constant (in Lemma 2.1(1)) and, since $A$ is
infinite, choose a string $w = u\#u^R$ in $A$ with $u \in \{0,1\}^*$ and $|u| \geq m$. Let us consider any
decomposition of the form $w = xyz$ with $|xy| \leq m$ and $|y| \geq 1$. Because $xy$ is a substring of
$u$, the string $xy^2z$ cannot be of the form $v\#v^R$ for any string $v \in \{0,1\}^*$. This contradicts
the conclusion of the pumping lemma, and therefore $A$ does not exist. As a result, $Pal_\#$ is indeed
REG-immune.
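
The pumping step of this argument can be verified exhaustively for short strings. The following sketch,
with names of our own choosing, takes each $u \in \{0,1\}^*$ with $|u| \geq m$ and confirms that every
decomposition of $u\#u^R$ into $xyz$ with $|xy| \leq m$ and $|y| \geq 1$ yields a string $xy^2z$ outside
the marked palindrome language.

```python
from itertools import product

def in_pal_marked(s: str) -> bool:
    """Membership in {w # w^R | w in {0,1}*}."""
    if s.count("#") != 1:
        return False
    left, right = s.split("#")
    return set(left) <= {"0", "1"} and right == left[::-1]

def pumping_breaks(u: str, m: int) -> bool:
    """For w = u # u^R with |u| >= m, check that every split w = xyz with
    |xy| <= m and |y| >= 1 gives x y^2 z outside the language."""
    w = u + "#" + u[::-1]
    for end in range(1, m + 1):        # end = |xy|
        for start in range(end):       # start = |x|
            x, y, z = w[:start], w[start:end], w[end:]
            if in_pal_marked(x + y * 2 + z):
                return False
    return True

m = 3
print(all(pumping_breaks("".join(t), m)
          for n in range(m, 6) for t in product("01", repeat=n)))
```
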
Second, we shall prove that $Pal_\#$ is not in REG/n. Instead of applying the swapping lemma (i.e., Lemma
2.1(3)) directly, we want to prove our claim by applying a known result of $Pal \notin$ REG/n [29]. In what
follows, we want to show that if $Pal_\# \in$ REG/n then $Pal \in$ REG/n, which contradicts the fact that
$Pal \notin$ REG/n. Therefore, it immediately follows that $Pal_\#$ does not belong to REG/n. Write
$\Sigma$ for the ternary alphabet $\{0,1,\#\}$. Now, let us assume that $Pal_\# \in$ REG/n. There are an
alphabet $\Gamma$, an advice function $h$ from $\mathbb{N}$ to $\Gamma^*$, and a language $L \in$ REG such
that, for every string $x \in \Sigma^*$, (i) $|h(|x|)| = |x|$ and (ii) $x \in Pal_\#$ iff
$[{}^{x}_{h(|x|)}] \in L$. For later convenience, assuming that $\{0,1\} \subseteq \Gamma$, we define
$h'(n) = [{}^{h(n)}_{0^k10^k}]$ if $n = 2k+1$, and $h'(n) = [{}^{h(n)}_{0^n}]$ otherwise, and moreover, we
define $L' = \{[{}^{x}_{w}] \mid \exists u,v \in \Gamma^{|x|}\,[\,w = [{}^{u}_{v}] \;\&\; [{}^{x}_{u}] \in L
\;\&\; \exists m,m'\,[\,v = 0^m10^{m'}\,]\,]\}$. Clearly, $L'$ is regular and, for every string $x$,
$x \in Pal_\#$ iff $[{}^{x}_{h'(|x|)}] \in L'$.

Our plan is to remove the separator # from $Pal_\#$, resulting in Pal. To carry out this plan, we
introduce a new Turing machine $M$ and a new advice function $g$ for Pal. If $h'(n+1)$ is of the form
[ 0^-1 ][ ''o?o"' ][ o^"'i ]. then we define g{n) = [ ^k^i ][ 7// ][ ^k^i ]. On the contrary, if /i'(n + 1) is of 
the form [ ], then let g(ri) = [ ]. Our two-way off-line Turing machine M, which is equipped with a 
single input/work tape, behaves as follows. 

On input [ ], M checks if |x| is even, w is of the form [ " ] with \u\ — \v\, and v is of 
the form [ grk ][ "1°^ ][ \ for certain elements u\,U2 G S* and cti,(T2,o'3 e E. If not, M 
rejects the input. Otherwise, from the input string [ ^ ], M generates [ "^*"^ ], where vJ = 
[ 0™ ][ "'o'A'^ ][ o"«' ]' on tlic single tape, and M then checks if [ "i*"' ] e 

It is not difficult to check that $M$ halts in time $O(n)$. Since 1-DLIN = REG, the set of all strings
accepted by $M$ belongs to REG. Moreover, we can show that, for every input $x \in \{0,1\}^*$, $x \in Pal$
iff $M$ accepts $[{}^{x}_{g(|x|)}]$. From this equivalence, we conclude that Pal is in REG/n, as we have
planned. □

Away from REG-immunity, we next discuss CFL-immune languages. As noted before, the language
family CFL(2) $\cap$ REG/n (and thus CFL(2) $\cap$ CFL/n) is CFL-immune; however, it is not known whether
CFL(2) $-$ CFL/n is also CFL-immune. Instead, we demonstrate in the following proposition that L $-$ CFL/n
is CFL-immune, where L consists of all languages recognized by deterministic Turing machines with a single
read-only input tape and a logarithmic-space bounded work tape.

Proposition 3.2 The language family L $-$ CFL/n is CFL-immune.

As a CFL-immune language outside of CFL/n, we plan to consider a marked version of the language
$Dup = \{ww \mid w \in \{0,1\}^*\}$ (duplicating strings), which is denoted $Dup_\#$; namely,
$Dup_\# = \{w\#w \mid w \in \{0,1\}^*\}$. The major reason for using this marked language is that, similar
to the case of Pal, Dup is not even REG-immune.

Proof of Proposition 3.2 Letting $\Sigma = \{0,1,\#\}$, we shall show that $Dup_\#$ is CFL-immune but not
in CFL/n. Our first claim is the CFL-immunity of $Dup_\#$. Toward a contradiction, we assume that $Dup_\#$
has an infinite context-free subset $A$. Let $m$ be a pumping-lemma constant (in Lemma 2.1(2)), take any
string $v = w\#w$ with $|w| \geq m$, and let $v = xyz$ be any decomposition satisfying that $|xy| \leq m$
and $|y| \geq 1$. Since $|xy| \leq m$, the string $xy^2z$ should be of the form $w'\#w$ ($w' \neq w$) and
thus it cannot belong to $Dup_\#$, a contradiction against the pumping lemma. Therefore, the conclusion
that $Dup_\#$ is CFL-immune follows immediately.

Our next claim is that $Dup_\# \notin$ CFL/n. Now, assuming otherwise that $Dup_\# \in$ CFL/n, we take an
advice function $h$ and a context-free language $L$ such that, for every string $x \in \Sigma^*$,
$x \in Dup_\#$ iff $[{}^{x}_{h(|x|)}] \in L$. Similar to the proof of Proposition 3.1, we can show the
existence of another language $L' \in$ CFL and another advice function $g$ such that, for any string
$x \in \{0,1\}^*$, $x \in Dup$ iff $[{}^{x}_{g(|x|)}] \in L'$. This equivalence concludes that Dup belongs
to CFL/n, contradicting the fact that $Dup \notin$ CFL/n [29]. As an immediate consequence, we obtain the
desired result that $Dup_\# \notin$ CFL/n. □

The immunity notion has given rise to the notion of simplicity. In general, a language $L$ is called
C-simple if (i) $L$ is infinite, (ii) $L$ is in C, and (iii) $\overline{L}$ is C-immune. The existence of
such a C-simple language clearly leads to a class separation C $\neq$ co-C. Because of this implication, we
do not know whether NP-simple languages exist (since, otherwise, NP $\neq$ co-NP follows). It is therefore
natural to ask if CFL-simple languages actually exist. In what follows, we shall prove the existence of
such CFL-simple languages.
Proposition 3.3 There exist CFL-simple languages. Moreover, the complements of some of those languages
belong to CFL(2) $\cap$ REG/n.

Our example of CFL-simplicity is the complement of a language $L_{keq}$ ($k \geq 3$), which is a natural
generalization of $L_{3eq}$. Let $k \geq 3$ be fixed. We define
$L_{keq} = \{a_1^na_2^n\cdots a_k^n \mid n \in \mathbb{N}\}$ over the $k$-letter alphabet
$\Sigma_k = \{a_1, a_2, \ldots, a_k\}$. We shall show that the complement of $L_{keq}$, where $k \geq 3$,
is indeed CFL-simple. This gives a clear contrast with the fact that an associated language
$3Equal = \{w \in \{a,b,c\}^* \mid \#_a(w) = \#_b(w) = \#_c(w)\}$ is not even REG-immune.

Proof of Proposition 3.3 We intend to show that, for each index $k \geq 3$, (1) $\overline{L_{keq}}$ is
in CFL, (2) $L_{keq}$ is in CFL(2) $\cap$ REG/n, and (3) $L_{keq}$ is CFL-immune.

(1) Our first claim is that $\overline{L_{keq}}$ belongs to CFL. To simplify our proof, we shall argue
only on the case $\overline{L_{3eq}}$. Let us introduce two additional languages
$L_{3neq} = \{a^kb^lc^m \mid k \neq l, l \neq m, \text{ or } k \neq m\}$ and
$L_3 = \{a^kb^lc^m \mid k,l,m \in \mathbb{N}\}$. Note that $L_{3neq}$ equals the union of the following
three sets: $\{a^kb^lc^m \mid k \neq l, m \geq 0\}$, $\{a^kb^lc^m \mid l \neq m, k \geq 0\}$, and
$\{a^kb^lc^m \mid m \neq k, l \geq 0\}$, all of which are apparently context-free. Since CFL is closed
under union, $L_{3neq}$ belongs to CFL. Moreover, since
$\overline{L_{3eq}} = L_{3neq} \cup \overline{L_3}$ and $\overline{L_3} \in$ REG $\subseteq$ CFL, the
language $\overline{L_{3eq}}$ is also in CFL.
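
The set identity used in step (1), namely that the complement of $\{a^nb^nc^n\}$ is the union of the
"unequal exponents" language with the complement of $a^*b^*c^*$, can be confirmed on all short strings.
The sketch below is only a brute-force check with hypothetical helper names; it does not construct the
underlying npda's.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(a*)(b*)(c*)")

def exponents(w: str):
    """Return (k, l, m) if w = a^k b^l c^m, and None otherwise."""
    match = BLOCKS.fullmatch(w)
    return tuple(len(g) for g in match.groups()) if match else None

def in_L3(w: str) -> bool:
    """Membership in the regular language a* b* c*."""
    return exponents(w) is not None

def in_3eq(w: str) -> bool:
    """Membership in {a^n b^n c^n}."""
    e = exponents(w)
    return e is not None and e[0] == e[1] == e[2]

def in_3neq(w: str) -> bool:
    """Membership in {a^k b^l c^m | k != l, l != m, or k != m}."""
    e = exponents(w)
    return e is not None and (e[0] != e[1] or e[1] != e[2] or e[0] != e[2])

def identity_holds(max_len: int) -> bool:
    """Check complement(L_3eq) = L_3neq ∪ complement(L_3) up to max_len."""
    return all(
        (not in_3eq(w)) == (in_3neq(w) or not in_L3(w))
        for n in range(max_len + 1)
        for w in map("".join, product("abc", repeat=n))
    )

print(identity_holds(7))
```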

(2) To show that $L_{keq} \in$ REG/n, choose an advice function $h$ defined as
$h(n) = a_1^{n/k}a_2^{n/k}\cdots a_k^{n/k}$ for all numbers $n \equiv 0 \pmod{k}$ and $h(n) = 0^n$ for all
the other $n$'s. If we consider the regular language $\{[{}^{w}_{w}] \mid w \in \Sigma_k^*\}$, then
$[{}^{w}_{h(|w|)}]$ belongs to it exactly when $w = h(|w|)$, which means $w \in L_{keq}$. Thus, it follows
that $L_{keq} \in$ REG/n, as requested. To show that $L_{keq} \in$ CFL(2), let us deal only with the case
where $k = 2m$ and $m = 2j+1$ for a certain number $j \in \mathbb{N}^+$, since the other cases are similar.
We introduce two useful languages $L_1$ and $L_2$ defined as follows.

• $L_1$ consists of all strings of the form $a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k}$ such that $n_i = n_{k+1-i}$ for all indices $i \in [1,m]_{\mathbb{Z}}$.

• $L_2$ consists of all strings of the form $a_1^{n_1} a_2^{n_2} \cdots a_k^{n_k}$ such that $n_{2i+1} = n_{2i+2}$ and $n_{2i+m+1} = n_{2i+m+3}$
for all $i \in [0, \ldots]_{\mathbb{Z}}$.

Clearly, $L_1$ and $L_2$ are both context-free. Since the target language $L_{keq}$ can be expressed as $L_1 \cap L_2$, $L_{keq}$
belongs to CFL(2).

(3) Finally, we shall check the CFL-immunity of $L_{keq}$. Assume that there exists an infinite subset
$A \in$ CFL of $L_{keq}$. Let $m$ be a pumping-lemma constant (in Lemma 2.1(2)). Choose $w = a_1^n a_2^n \cdots a_k^n$ in $A$
with $n > m$. Take a decomposition $w = uvxyz$ with $|vxy| \le m$ and $|vy| \ge 1$ such that $w_i = uv^ixy^iz$ is in
$A$ for every index $i$. Since $|vxy| \le m < n$, there is an index $i$ such that $vxy$ is a substring of either $a_i^n$ or
$a_i^n a_{i+1}^n$. Thus, there are only two cases: (i) $v$ and $y$ are both substrings of $a_i^n$ or (ii) $v$ is a substring of $a_i^n$
and $y$ is a substring of $a_{i+1}^n$. In either case, the string $w_2 = uv^2xy^2z$ cannot belong to $A$. This is absurd and
therefore $A$ does not exist. We thus reach a conclusion of the CFL-immunity of $L_{keq}$. □
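
The pumping argument in part (3) can be checked mechanically on small instances. The following Python sketch is only an illustration and not part of the proof: it enumerates every decomposition $w = uvxyz$ with $|vxy| \le m$ and $|vy| \ge 1$ of $w = a_1^n \cdots a_k^n$ and confirms that $uv^2xy^2z$ leaves $L_{keq}$. The helper names (`in_lkeq`, `pumped_strings`, `pumping_escapes`) and the use of ordinary characters in place of the letters $a_1, \ldots, a_k$ are assumptions made for this example.

```python
def in_lkeq(w, letters="abc"):
    """Membership in L_keq = { a_1^n ... a_k^n : n >= 0 }; letters[i] plays the role of a_{i+1}."""
    k = len(letters)
    if len(w) % k != 0:
        return False
    n = len(w) // k
    return w == "".join(ch * n for ch in letters)


def pumped_strings(w, m):
    """Yield u v^2 x y^2 z for every split w = uvxyz with |vxy| <= m and |vy| >= 1."""
    for start in range(len(w) + 1):
        for lv in range(m + 1):
            for lx in range(m + 1 - lv):
                for ly in range(m + 1 - lv - lx):
                    end = start + lv + lx + ly
                    if lv + ly == 0 or end > len(w):
                        continue
                    u, v = w[:start], w[start:start + lv]
                    x = w[start + lv:start + lv + lx]
                    y, z = w[start + lv + lx:end], w[end:]
                    yield u + 2 * v + x + 2 * y + z


def pumping_escapes(n, m, letters="abc"):
    """True iff pumping once (i = 2) always leaves L_keq for w = a_1^n ... a_k^n."""
    w = "".join(ch * n for ch in letters)
    return all(not in_lkeq(p, letters) for p in pumped_strings(w, m))
```

For instance, `pumping_escapes(4, 3)` examines $w = a^4b^4c^4$ with pumping constant 3. A finite check of this kind only illustrates the case analysis; the immunity claim itself rests on the argument above.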

Notice that our CFL-simple language $\overline{L_{keq}}$ is not even REG-immune because, for instance, $\overline{L_3}$ is an
infinite regular subset of $\overline{L_{3eq}}$. It is still open whether there is a REG-immune CFL-simple language.

In the remaining portion of this section, we briefly discuss a density issue of REG-immune languages. Note
that all context-free REG-immune languages $L$ shown in this section satisfy the following density property:
the density rate $dense(L)(n)/|\Sigma^n|$ is "exponentially small" in terms of a length parameter $n$. The language
$Pal_{\#}$, for example, satisfies that $dense(Pal_{\#})(n)/|\Sigma^n| \le 2^{\lfloor n/2 \rfloor}/3^n$ (thus $dense(Pal_{\#})(n) \le |\Sigma^n|/(2.2^n)$)
for every odd length $n \ge 1$. Naturally, we can question if there exists a context-free REG-immune language
whose density rate is "polynomially large." To be more precise, we call a language $L$ over an alphabet $\Sigma$
polynomially dense (or p-dense, in short) exactly when there exist a number $n_0 \in \mathbb{N}$ and a non-zero polynomial
$p$ such that $dense(L)(n) \ge |\Sigma^n|/p(n)$ for all numbers $n \ge n_0$. Our previous question is now rephrased as: is
there any p-dense REG-immune language in CFL, or is CFL p-dense REG-immune? Unfortunately, we are
unable to settle this question at present; however, we can show that L $\cap$ CFL/n is p-dense REG-immune.

Proposition 3.4 The language family L $\cap$ CFL/n is p-dense REG-immune.

Let us consider the language $LCenter = \{au0^m10^mv \mid a \in \{\lambda,0,1\},\ 2^m \le |u| = |v| < 2^{m+1}\}$ over the
alphabet $\{0,1\}$. We claim that this language is REG-immune and also p-dense. Notice that LCenter is in
CFL/n.

Proof of Proposition 3.4. We want to show that LCenter is p-dense REG-immune. We first show that
LCenter is p-dense. Let $w = au0^m10^mv$ be in LCenter with $2^m \le |u| = |v| < 2^{m+1}$. Let $n = |w|$. Consider the
case where $a = \lambda$. In this case, we have $2^m \le |u| = (n-2m-1)/2 < 2^{m+1}$, which implies $2^{m+1} + 2m + 1 \le n$.
Thus, $n^2 \ge 2^{2m+1}$. Since $dense(LCenter)(n) = 2^{n-2m-1}$, we obtain

\[ \frac{dense(LCenter)(n)}{|\Sigma^n|} = \frac{2^{n-2m-1}}{2^n} = \frac{1}{2^{2m+1}} \ge \frac{1}{n^2}. \]

The other cases where $a \in \{0,1\}$ are similar. Therefore, LCenter is p-dense.

Next, we show that LCenter is REG-immune. Assuming otherwise, we choose an infinite subset $A$ of
LCenter in REG. We use Lemma 2.1. Take a pumping-lemma constant $m > 0$. Let $w = au0^k10^kv$ be any
string in $A$ with $k > m$ and $2^k \le |u| = |v| < 2^{k+1}$. Consider the case where $a = \lambda$. The other cases are
similar.

Consider any decomposition $w = xyz$ with $|xy| \le m$ and $|y| \ge 1$. Consider $w_0 = xz$. Since $|xy| \le m$, $y$ is a
substring of $u$. Since $k > m$, the center symbol of $w_0$ should be 0. Thus, $w_0$ cannot belong to LCenter. This
is a contradiction against the conclusion of the pumping lemma. Therefore, LCenter must be REG-immune.

□ 



4 Properties of Immune Languages 

Immune languages lack infinite subsets of certain complexity and therefore they are of relatively high
complexity. We have presented a few REG-immune languages in the previous section. For a much better
understanding of REG-immunity, we intend to examine its fundamental nature by
studying its relationships to other notions, such as quasireduction, hardcore, and levelability. The first
example concerns a nonregularity measure, which gives another characterization of REG-immunity.
The nonregularity $N_L(n)$ of a language $L$ at $n$ is the minimal number of equivalence classes in $\Sigma^n/\!\equiv_L$, where
the relation $\equiv_L$ is defined as: $x \equiv_L y$ iff $\forall z \in \Sigma^* [xz \in L \iff yz \in L]$.

Proposition 4.1 A language $L$ is REG-immune iff $L$ is infinite and, for every infinite subset $A$ of $L$ and
for every constant $c > 0$, $N_A(n) > c$ holds for an infinite number of indices $n \in \mathbb{N}$.

This proposition is a natural extension of the so-called Myhill-Nerode Theorem [17] , which bridges between 
the nonregularity and REG. We include its proof only for completeness. 

Proof of Proposition 4.1. (If-part) We prove a contrapositive. Assume that $L$ has an infinite subset $A$
in REG. Since $A \in$ REG, by the Myhill-Nerode Theorem [17], the cardinality of the set $\Sigma^*/\!\equiv_A$ is constant,
say, $c$ (not depending on $n$). In other words, $N_A(n)$ is upper-bounded by $c$.

(Only If-part) Let $\{A_1, A_2, \ldots\}$ be a set of equivalence classes in $\Sigma^*/\!\equiv_L$. Take the lexicographically
minimal string, say, $\alpha_i$ from each set $A_i$. Consider a dfa $M$ with its transition function $\delta$ defined by:
$\delta(i,a) = j$ iff $\alpha_i a \equiv_L \alpha_j$. The set of final states is $F = \{i \mid \alpha_i \in L\}$. It is not difficult to check that $M$ indeed
recognizes $A$. This implies that $A$ is regular, a contradiction. □

Our notion of 1-DLIN-m-quasireduction gives another characterization of the REG-immunity. Recall
from Section 2 the partial function class 1-FLIN(partial). A 1-DLIN-m-quasireduction from $L$ to $A$ is a
single-valued partial function $f$ that satisfies the following two conditions: for every string $x$, (i) when $f(x)$
is defined, $x \in L$ iff $f(x) \in A$ and (ii) $f$ is in 1-FLIN(partial).

Lemma 4.2 The language $L$ is REG-immune iff $L$ is infinite and for any set $B$ and for any
1-DLIN-m-quasireduction $f : L \to B$ and for any $u \in B$, $f^{-1}(u)$ is finite.

Proof. ($\Rightarrow$) Assume that $L$ is not REG-immune. Take an infinite regular subset $A \subseteq L$. Choose an
element $u_0 \in A$ and, for every string $x$, define $f(x) = u_0$ if $x \in A$ and undefined otherwise. Clearly, $f$
belongs to 1-FLIN(partial).

($\Leftarrow$) Assume that we have an infinite set $L$, another set $A$, a 1-DLIN-m-quasireduction $f : L \to A$, and
an element $u_0 \in A$ such that $f^{-1}(u_0)$ is infinite. Consider the set $B = f^{-1}(u_0)$. Obviously, $B$ is infinite.
Note that $x \in B$ iff $M(x)$ halts in an accepting state and outputs $u_0$. Hence, $B$ is in REG. Therefore, $L$ has
an infinite regular subset. □

Next, we give a "hardcore" characterization; however, our definition of "hardcore" is quite different from
a standard definition of a (polynomial) hardcore for polynomial-time bounded computation (see, e.g., [3] for
its definition). Using an npda, we instead place a space restriction on the size of the stack used by the
npda. More accurately, for any npda $M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$, any constant $k \in \mathbb{N}$, and any input string
$x \in \Sigma^*$, we introduce the notation $M(x)_k$ as follows: (1) $M(x)_k = 1$ if there is an accepting path of $M$
on the input $x$ with stack size at most $k$; (2) $M(x)_k = 0$ if all computation paths of $M$ on $x$ are rejecting
paths with stack size at most $k$; and (3) $M(x)_k$ is undefined otherwise. A context-free language $A$ is called
a REG-hardcore for a language $S$ if, for any constant $k \in \mathbb{N}$ and any npda $M$ recognizing $A$, there exists a
finite set $B \subseteq S$ such that $M(x)_k$ is undefined for all strings $x \in S - B$.

Proposition 4.3 Let $S$ be any infinite context-free language. The following two statements are equivalent.

1. The language $S$ is REG-immune.

2. The language $S$ is a REG-hardcore for $S$.

Proof. (1 implies 2) We shall prove a contrapositive. Let $S$ be any context-free language. Assuming that
$S$ is not a REG-hardcore for $S$, we plan to prove that $S$ is not REG-immune. There exist a constant $k \in \mathbb{N}$
and an npda $M$ with $L(M) = S$ such that, for every finite set $B \subseteq S$, $M(x)_k$ is defined (i.e., $M(x)_k \in \{0,1\}$)
for a certain input $x \in S - B$. Now, let us introduce a new npda $N$ as follows: on input $x$, we simulate $M$ on
$x$ nondeterministically and, along each computation path, whenever its stack size exceeds $k$, we immediately
reject $x$. Consider the set $L(N)$ of all strings accepted by $N$. By the definition of $N$, it follows that $L(N) \subseteq S$.

First, we claim that $L(N)$ is regular. Since $k$ is a fixed constant, we can express the entire content of
the stack as a certain new internal state. Tracking down this state, we can simulate $N$ using a certain
nondeterministic finite automaton (or nfa). This implies that $L(N)$ is regular.

Next, we claim that $L(N)$ is infinite. Notice that $S$ is infinite. For every finite subset $B$ of $S$, there is
a string $x \in S - B$ satisfying $x \in L(N)$. From this property, we can conclude that $L(N)$ is infinite. Since
$L(N)$ is an infinite regular subset of $S$, $S$ is not REG-immune.

(2 implies 1) Similarly, we first assume that a context-free language $S$ is not REG-immune. This means
that there exists a dfa $M$ for which $L(M) \subseteq S$ and $L(M)$ is infinite. Since $S$ is context-free, take an npda $N$
that recognizes $S$. Now, let us define a new npda $M'$ as follows: on input $x$, $M'$ splits its computation into
two nondeterministic computation paths and then simulates $M$ and $N$ along these paths separately. Clearly,
$L(M') = L(M) \cup L(N) = S$. Choose $k = 1$ and consider $M'(x)_k$. For every string $x \in L(M)$, $M'(x)_k = 1$
follows since $M$ is a dfa. Let $B$ be any finite subset of $S$. Because $L(M) - B$ is infinite within $S$, there exists
a string $x$ in $S - B$ for which $M'(x)_k = 1$. This implies that $S$ cannot be a REG-hardcore for $S$. □

At the end of this section, we shall discuss a slightly weaker immunity notion, known as almost immunity.
A language $L$ is said to be almost C-immune if $L$ is the union of a C-immune set and a set in C. Since
C-immune languages are almost C-immune, CFL naturally contains almost C-immune languages. Let us
consider a simple example $L = \{0^n x \mid x \in \{0^n, 1^n\}, n \in \mathbb{N}\}$. This language $L$ is almost REG-immune
(because $L = \{0^{2n} \mid n \in \mathbb{N}\} \cup L_{eq}$) but obviously not REG-immune. When an infinite language $L$ is
not almost C-immune, it is said to be C-levelable. Since every C-levelable language is not C-immune, the
levelability of a language strengthens its non-immunity. A language family $\mathcal{D}$ is C-levelable if $\mathcal{D}$ contains a
C-levelable language. Concerning NP-levelability, all "known" NP-complete languages are NP-levelable.

Let us demonstrate two examples of REG-levelable languages. We have already seen in Section 3 that
Equal and Pal are not REG-immune. We shall strengthen this fact by showing that Equal and Pal are
both REG-levelable. Note that Equal is in CFL $\cap$ REG/n and Pal is in CFL $-$ REG/n.

Proposition 4.4 The languages Equal and Pal are both REG-levelable.

To show this proposition, we need a general statement on a necessary condition for a language to be
REG-levelable. In our REG setting, we need to require slightly different conditions (in comparison to
a polynomial-time setting, see [2H]). We say that a language $L$ is 1-DLIN-m-autoreducible if there exist a
function $f$ (called an autoreduction) and a linear-time one-tape one-head Turing machine $M$ such that, for
every string $x$, (1) $M$ on the input $x$ outputs $f(x)$ and (2) $x \in L$ iff $f(x) \in L$. We say that a function $f$
is length-increasing if $|f(x)| > |x|$ for every string $x$. We say that a function $f$ is 1-DLIN-invertible if there
exists a one-tape one-head off-line linear-time deterministic Turing machine $M$ such that $M(f(x))$ outputs
$x$ for every string $x$.

The proof of the following lemma is a simple modification of [IHl Lemma 5.4], which is based on an 
argument in [25]. We include the proof only for completeness. 

Lemma 4.5 Let $L$ be any non-regular language. If $L$ is 1-DLIN-m-autoreducible by an autoreduction $f$
that is length-increasing and 1-DLIN-invertible, then $L$ and $\overline{L}$ are both REG-levelable.

Proof. Assume that $L$ is almost REG-immune and 1-DLIN-m-autoreducible by an autoreduction $f$ that is
length-increasing and 1-DLIN-invertible. Take $B \in$ REG and $C$, which is REG-immune, such that $L = B \cup C$.
Define $D = \{x \mid x \notin B, f(x) \in B\}$. Clearly, $D \in$ REG. We want to show that $D$ is infinite, leading to
a contradiction against the immunity of $C$. If $D$ is finite, then $C - (B \cup C)$ is infinite. Take $z_0 \in D$,
which is the lexicographically largest element. Let $x \in C - (B \cup C)$, which is minimal such that $|x| > |z_0|$.
Define $H = \{f^{(i)}(x) \mid i \in \mathbb{N}\}$, where $f^{(i)}(x)$ denotes the $i$-fold composition of $f$ on $x$ (in particular,
$f^{(0)}(x) = x$). Since $f$ is 1-DLIN-invertible, $H$ is in REG. We claim that $H \cap B \ne \emptyset$, because, otherwise,
$H$ is an infinite subset of $C$, a contradiction. Thus, $f^{(k)}(x) \in D$ for a certain number $k$. This implies that
$|f^{(k)}(x)| \le |z_0| < |x|$, a contradiction since $f$ is length-increasing. □

Proposition 4.4 is now easily proven by Lemma 4.5.

Proof of Proposition 4.4. Following Lemma 4.5, it suffices to show that Equal and Pal are both
1-DLIN-m-autoreducible by certain autoreductions that are length-increasing and 1-DLIN-invertible. First,
we consider the case Equal. Define our desired autoreduction $f$ as $f(x) = x01$. It is easy to see that
$x \in Equal$ iff $f(x) \in Equal$. Moreover, $f$ is length-increasing and 1-DLIN-invertible. Next, we show that
Pal is length-increasing 1-DLIN-m-autoreducible. In this case, define our autoreduction $f$ as $f(x) = 0x0$.
Obviously, it holds that $x \in Pal$ iff $f(x) \in Pal$. Obviously, $f$ is length-increasing and 1-DLIN-invertible. □
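
The two autoreductions used in this proof are simple enough to test exhaustively on short inputs. The sketch below is an illustration only. It assumes that Equal is the set of binary strings with equally many 0s and 1s and that Pal is the set of binary palindromes; all function names are hypothetical. It models the maps $f(x) = x01$ and $f(x) = 0x0$ as plain string functions rather than one-tape linear-time machines.

```python
from itertools import product


def in_equal(x):
    """Assumed definition: x has as many 0s as 1s."""
    return x.count("0") == x.count("1")


def in_pal(x):
    """Assumed definition: x reads the same forwards and backwards."""
    return x == x[::-1]


def f_equal(x):
    return x + "01"


def f_equal_inv(y):
    return y[:-2]          # valid on the range of f_equal


def f_pal(x):
    return "0" + x + "0"


def f_pal_inv(y):
    return y[1:-1]         # valid on the range of f_pal


def check_autoreduction(f, f_inv, member, max_len):
    """Check membership preservation, length increase, and invertibility up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            y = f(x)
            if member(x) != member(y) or len(y) <= len(x) or f_inv(y) != x:
                return False
    return True
```

Calling `check_autoreduction(f_equal, f_equal_inv, in_equal, 8)` or the analogous call for Pal exercises the three properties required by Lemma 4.5 on all strings of length at most 8.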



5 Existence of Bi-Immune Languages 

The existence of natural REG-immune languages within CFL encourages us to search for much stronger
"immune" languages in CFL. One such candidate is another variant of C-immunity, known as C-bi-immunity
in [5], where a language $L$ is C-bi-immune if $L$ and its complement $\overline{L}$ are both C-immune. For brevity, a
language family $\mathcal{D}$ is said to be C-bi-immune if there is a C-bi-immune language in $\mathcal{D}$. Time-bounded
bi-immunity has been known to be related to the notion of genericity, which corresponds to certain
finite-extension diagonalization arguments (see, e.g., [T1[2S] for its connection).

Is there any REG-bi-immune language in CFL? When we look at all the examples of context-free
REG-immune languages shown in Section 3, they appear to lack the REG-bi-immunity property. Concerning the
existence of REG-immune CFL-simple languages discussed in Section 3, if CFL is not REG-bi-immune, then
no CFL-simple language can be REG-immune. Although we are unable to answer the question at this point,
we instead prove that L $\cap$ REG/n is REG-bi-immune.

Proposition 5.1 The language family L $\cap$ REG/n is REG-bi-immune.

How can we prove this proposition? Balcázar and Schöning [5] employed a diagonalization technique to
construct a P-bi-immune language inside EXP (deterministic exponential-time class). A disadvantage of such
a construction is that the constructed P-bi-immune language depends on how to enumerate all languages in
P. In our proof, we rather present two REG-bi-immune languages explicitly. Our desired REG-bi-immune
languages are $L_{even}$ and $L_{odd}$ given as follows:

• $L_{even} = \{w \in \{0,1\}^* \mid \exists k \in \mathbb{N}\, [2k < \log^{(2)} |w| \le 2k+1]\} \cup \{\lambda\}$, and

• $L_{odd} = \{w \in \{0,1\}^* \mid \exists k \in \mathbb{N}\, [2k+1 < \log^{(2)} |w| \le 2k+2]\} \cup \{0,1\}$.

Notice that these two languages form a partition of $\{0,1\}^*$; namely, $L_{even} \cup L_{odd} = \{0,1\}^*$ and
$L_{even} \cap L_{odd} = \emptyset$.

Proof of Proposition 5.1. It suffices to show that $L_{even}$ and $L_{odd}$ are both REG-immune because each
of them is the complement of the other. For brevity, let $\Sigma$ represent the binary alphabet $\{0,1\}$. We begin
with proving the REG-immunity of $L_{even}$ by contradiction. Assume now that there exists an infinite regular
subset $A$ of $L_{even}$. Take a pumping-lemma constant $m > 0$, given in Lemma 2.1(1), and choose a string $w$
in $A \cap \Sigma^n$ for a certain length $n$ with $n > m + 1$. Such $n$ satisfies that $2k < \log^{(2)} n \le 2k+1$ for a certain
number $k \in \mathbb{N}$. The pumping lemma (i.e., Lemma 2.1(1)) provides a decomposition $w = xyz$ with $|xy| \le m$
and $|y| \ge 1$ for which $w_i =_{def} xy^iz$ belongs to $A$ for an arbitrary number $i \in \mathbb{N}$. Write $\ell$ for the length of $y$.
Toward a contradiction, there are two cases to consider.
Case 1: Consider the case where $\log^{(2)} n = 2k+1$. In this case, we choose $i = n+1$. Since $1 \le \ell \le m$,
the length $|w_i|$ is sandwiched by two terms as

\[ 2^{2^{2k+1}} = n < |w_i| = n + (i-1)\ell \le n + n\ell \le n(m+1) < n^2 = 2^{2^{2k+2}}. \]

In short, it holds that $2k+1 < \log^{(2)} |w_i| < 2k+2$, implying that $w_i$ is in $L_{odd}$. Since $A \cap L_{odd} = \emptyset$, it
immediately follows that $w_i \notin A$, a contradiction.

Case 2: Consider the case where $\log^{(2)} n < 2k+1$. This means that $2^{2^{2k}} < n \le 2^{2^{2k+1}} - 1$. When we
choose $i = \lceil n(n-1)/\ell \rceil + 1$, the length $|w_i|$ can be lower-bounded by

\[ |w_i| \ge n + \frac{n(n-1)}{\ell} \cdot \ell = n + n(n-1) = n^2 > 2^{2^{2k+1}}. \]

In contrast, since $n > m/2$, we can upper-bound $|w_i|$ as

\[ |w_i| < n + \left(\frac{n(n-1)}{\ell} + 1\right) \cdot \ell = n^2 + \ell \le n^2 + m < (n+1)^2 \le 2^{2^{2k+2}}. \]

The above two bounds together imply that $2k+1 < \log^{(2)} |w_i| \le 2k+2$, concluding that $w_i \in L_{odd}$, a
contradiction against the fact that $w_i \in A$.

From the above two cases, we can conclude that $A$ does not exist; in other words, $L_{even}$ is REG-immune,
as requested. Similarly, we can show that $L_{odd}$ is REG-immune. Since $\overline{L_{even}} = L_{odd}$, the REG-bi-immunity
of $L_{even}$ and $L_{odd}$ follows immediately.

We still need to argue that $L_{even}$ and $L_{odd}$ are both in L $\cap$ REG/n. Since L $\cap$ REG/n is closed under
complementation, it suffices to show that $L_{even}$ belongs to L $\cap$ REG/n. First, we prove that $L_{even} \in$ REG/n.
Consider the following advice function: $h(n) = 10^{n-1}$ if $L_{even} \cap \Sigma^n \ne \emptyset$ and $h(n) = 0^n$ if $L_{odd} \cap \Sigma^n \ne \emptyset$.
Define a set $A$ as $A = \{\left[\begin{smallmatrix} x \\ 1y \end{smallmatrix}\right] \mid |x| = |y| + 1,\ y \in \{0,1\}^*\}$. It is obvious that, for every $x$, $x \in L_{even}$ iff
$\left[\begin{smallmatrix} x \\ h(|x|) \end{smallmatrix}\right] \in A$. Since $A \in$ REG, $L_{even}$ belongs to REG/n.

To show that $L_{even} \in$ L, we consider the following algorithm for $L_{even}$.

On input $x$, if $x = \lambda$ then accept it. Assume that $|x| \ge 1$. With access to $x$ on a read-only
input tape, compute $\lceil \log^{(2)} |x| \rceil$ on its log-space work tape. If $\lceil \log^{(2)} |x| \rceil$ is odd, then accept the
input; otherwise, reject it.

It is not difficult to show that this algorithm recognizes $L_{even}$ using only logarithmic space. This completes
our proof of the proposition. □
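
The membership test in this algorithm depends only on the input length, which can be made concrete with integer arithmetic. The following Python sketch is a hedged illustration, not the log-space machine itself: it computes $\lceil \log^{(2)} n \rceil$ as the least $j \ge 0$ with $2^{2^j} \ge n$, using only numbers of $O(\log n)$ bits. The function names are hypothetical. The treatment of the boundary lengths is an assumption: length 1 is placed in $L_{odd}$ as in the definition, and length 2, where $\log^{(2)}$ equals 0, is classified by the parity rule.

```python
def ceil_loglog(n):
    """Least j >= 0 with 2**(2**j) >= n; this equals ceil(log2(log2(n))) for n >= 2."""
    if n < 2:
        raise ValueError("log log n is undefined for n < 2")
    ceil_log = (n - 1).bit_length()        # ceil(log2 n) for n >= 2
    return (ceil_log - 1).bit_length()     # ceil(log2(ceil_log))


def in_l_even(w):
    """Accept iff w is empty or ceil(log^(2) |w|) is odd (lengths 1 and 2 handled by convention)."""
    n = len(w)
    if n == 0:
        return True
    if n == 1:
        return False                       # {0, 1} is placed in L_odd by definition
    return ceil_loglog(n) % 2 == 1
```

Under these conventions the lengths 3 and 4 fall in $L_{even}$, the lengths 5 through 16 in $L_{odd}$, and the lengths 17 through 256 in $L_{even}$ again, which shows the doubly exponential spacing that the pumping argument exploits.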






6 P-Denseness and Primeimmunity 

Non-immunity of a language guarantees the existence of a certain infinite subset that is computationally
"easy." In practice, many non-REG-immune languages have infinite regular subsets of low density. In typical
examples, there are infinite tally subsets $\{(01)^n \mid n \in \mathbb{N}\}$ and $\{(012)^n \mid n \in \mathbb{N}\}$ inside Equal and 3Equal,
respectively. These subsets are not even close to being polynomially dense or p-dense. Moreover, as discussed
in Section 3, it is unknown whether there exists a p-dense REG-immune language in CFL. This situation
also signifies the importance of p-denseness.

Apart from the standard C-immunity, we turn our attention to p-dense languages that lack only p-dense
regular subsets. Such languages are referred to as C-primeimmune. More generally, for a language family
C, we say a language $L$ over $\Sigma$ is C-primeimmune if (1) $L$ is p-dense and (2) $L$ has no p-dense subset in C,
and a language family $\mathcal{D}$ is C-primeimmune if there exists a C-primeimmune language in $\mathcal{D}$. This definition
immediately yields the following self-exclusion property: C cannot be C-primeimmune.

An obvious relationship holds between p-dense REG-immunity and REG-primeimmunity. If L is p-dense 
but not REG-primeimmune, then L contains a p-dense regular subset A. By the definition of p-denseness, 
A should be infinite and thus L must not be REG-immune. The next lemma therefore follows. 

Lemma 6.1 Let $L$ be any language over an alphabet $\Sigma$ with $|\Sigma| \ge 2$. If $L$ is p-dense REG-immune, then
$L$ is REG-primeimmune.
An apparent example of REG-primeimmunity is the language LCenter given in Section 3. We have
already shown that LCenter is p-dense REG-immune; hence, Lemma 6.1 shows the REG-primeimmunity of
LCenter. Unfortunately, LCenter is not context-free, and therefore this example is not sufficient to conclude
that CFL is REG-primeimmune. Rather, we shall take a more direct approach to the REG-primeimmunity
of CFL.

Let us recall the context-free language Equal over the binary alphabet $\{0,1\}$. Since Equal is
technically not p-dense, we need to extend it slightly and define its "extended" language
$Equal_* = \{aw \mid a \in \{\lambda,0,1\}, w \in Equal\}$. Despite the non-REG-immunity of $Equal_*$, we can prove that $Equal_*$ is
REG-primeimmune. In the next proposition, however, we shall prove a slightly stronger statement: $Equal_*$
is REG/n-primeimmune. Such REG/n-primeimmunity signifies a stark difference from REG/n-immunity,
since there exists no REG/n- immune language (because every infinite language L over an alphabet S has 
an infinite subset of the form {ax £ L \ h{\ax\) = ax} in REG/n, where (t = [ " ] and h is an advice 
function defined as h{n+ 1) = au if au is the minimal string in Ln S" and h{n+ 1) = 0"+^ otherwise). The 
REG/n-primeimmunity of $Equal_*$ also leads to the obvious conclusion that $Equal_* \notin$ REG/n, because REG/n
is not REG/n-primeimmune.

Proposition 6.2 The language $Equal_*$ is REG/n-primeimmune.

Proof. We start our proof with an easy claim that $Equal_*$ is p-dense. For any sufficiently large even
number $n$, by Stirling's approximation formula, the density of $Equal_*$ can be estimated as

\[ dense(Equal_*)(n) = \binom{n}{n/2} = \sqrt{\frac{2}{\pi n}}\, 2^n \left(1 + \Theta\!\left(\frac{1}{n}\right)\right) \ge \frac{2^n}{n}. \]

When $n$ is odd, on the contrary, we obtain

\[ dense(Equal_*)(n) = 2 \cdot dense(Equal_*)(n-1) \ge \frac{2 \cdot 2^{n-1}}{n-1} > \frac{2^n}{n}. \]

The above two lower bounds clearly yield the desired p-denseness of $Equal_*$.
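
The two density estimates can be sanity-checked numerically. The sketch below is illustrative only. It assumes that Equal consists of the binary strings with equally many 0s and 1s, so that $dense(Equal_*)(n)$ is $\binom{n}{n/2}$ for even $n$ and $2\binom{n-1}{(n-1)/2}$ for odd $n$. The function names are hypothetical, and the brute-force counter is included only to cross-check the closed form on short lengths.

```python
from itertools import product
from math import comb


def dense_equal_star(n):
    """Closed form for the number of length-n strings in Equal_* (under the assumed definition)."""
    if n % 2 == 0:
        return comb(n, n // 2)                 # a = lambda, w in Equal of length n
    return 2 * comb(n - 1, (n - 1) // 2)       # a in {0, 1}, w in Equal of length n - 1


def dense_equal_star_bruteforce(n):
    """Count strings aw of length n with a in {lambda, 0, 1} and w balanced, by enumeration."""
    def balanced(s):
        return s.count("0") == s.count("1")
    count = 0
    for bits in product("01", repeat=n):
        w = "".join(bits)
        if balanced(w) or (n >= 1 and balanced(w[1:])):
            count += 1
    return count


def is_p_dense_witness(n):
    """Check the bound dense(Equal_*)(n) >= 2^n / n using exact integers."""
    return dense_equal_star(n) * n >= 2 ** n
```

The bound $2^n/n$ corresponds to the polynomial $p(n) = n$ in the definition of p-denseness; the check uses exact integer arithmetic, so no floating-point rounding is involved.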

Our next target is to prove the non-existence of p-dense subsets of $Equal_*$ in REG/n. Assume otherwise
that there is a p-dense set $A \subseteq Equal_*$ in REG/n. Since $A$ is p-dense, a certain constant $d \ge 1$ satisfies
$dense(A)(n) \ge 2^n/n^d$ for all but finitely many numbers $n$. Let $m$ be a swapping-lemma constant for $A$ (from
Lemma 2.1(3)) and let $n$ be any sufficiently large number in $\mathbb{N}$. We consider only the case that $m$ is odd.
The other case where $m$ is even is similar and omitted. For each pair $i,k \in [0,n]_{\mathbb{Z}}$, let $A_{i,k}$ denote the set
$\{x \in A \cap \Sigma^n \mid \#_0(pref_i(x)) = k\}$ so that $A \cap \Sigma^n$ can be expressed as $A \cap \Sigma^n = \bigcap_{i=1}^{n} \left(\bigcup_{k=0}^{i} A_{i,k}\right)$. Here, we
claim the following key statement on $\{A_{i,k}\}_{i,k}$.

Claim 1 There are an index $i \in [1,n]_{\mathbb{Z}}$ and at least $m$ distinct indices $(k_1, k_2, \ldots, k_m)$ such that $A_{i,k_j} \ne \emptyset$
for every index $j \in [1,m]_{\mathbb{Z}}$.

Assuming that Claim 1 is true, let us choose $m$ distinct indices $(k_1, \ldots, k_m)$ and an index $i$ that satisfy
the claim. We then choose one string $x_j$ from each set $A_{i,k_j}$ and define $S = \{x_1, x_2, \ldots, x_m\}$. Clearly,
$|S| \ge m$. By the swapping lemma (Lemma 2.1(3)), there are two distinct strings $x = x_1x_2$ and $y = y_1y_2$ in $S$
with $|x_1| = |y_1| = i$ and $|x_2| = |y_2|$ such that the swapped strings $x_1y_2$ and $y_1x_2$ belong to $A$. This leads to
a contradiction because the choice of $S$ makes $x_1y_2$ satisfy $\#_0(x_1y_2) \ne \#_1(x_1y_2)$. This contradiction leads
us to conclude that $A$ does not exist, and therefore we finish the proof of Proposition 6.2.

Now, our remaining task is to prove Claim 1. Assume that this claim is false. Since $m$ is fixed, we omit
"$m$" in the rest of the proof. We abbreviate $\lfloor m/2 \rfloor$ as $m_0$ for brevity.

To simplify our description, we introduce new terminology: an $m$-index series $E$ is $(E_{m-1}, E_m, E_{m+1}, \ldots, E_n)$
with $E_i \subseteq [0,i]_{\mathbb{Z}}$ and $|E_i| = m$ for every index $i \in [m-1,n]_{\mathbb{Z}}$. For each $m$-index series $E$ and for any index
$\ell \in [m-1,n]_{\mathbb{Z}}$, let $T_{E,\ell} = \{w \in \Sigma^{\ell} \mid \forall i \in [m-1,\ell]_{\mathbb{Z}}\, (\#_0(pref_i(w)) \in E_i)\}$. Our assumption yields the
existence of an appropriate $m$-index series $E = (E_{m-1}, E_m, \ldots, E_n)$ for which $A \cap \Sigma^n = \bigcap_{i=m-1}^{n} \bigcup_{k \in E_i} A_{i,k}$.

To estimate the cardinality $|A \cap \Sigma^n|$, we further define a special $m$-index series $D = (D_{m-1}, D_m, \ldots, D_n)$: for
each index $i \in [m-1,n]_{\mathbb{Z}}$, let $D_i = [\lceil i/2 \rceil - m_0, \lceil i/2 \rceil + m_0]_{\mathbb{Z}}$. The corresponding set $T_{D,\ell}$ is abbreviated
as $S_\ell$. In what follows, we claim that $|S_n|$ upper-bounds $|A \cap \Sigma^n|$.

Claim 2 For any length $n \in \mathbb{N}$, $|A \cap \Sigma^n| \le |S_n|$.

As an immediate consequence of Claim 2, since $A$ is p-dense, we obtain a lower bound $|S_n| \ge 2^n/n^d$ for
all but finitely many numbers $n$. In contrast, the following statement gives an upper bound of $|S_n|$.
Claim 3 There exists a constant $c$ with $1 < c < 2$ such that $|S_n| \le c^n$ for all sufficiently large numbers $n$.

Together with the p-denseness of $A$, Claim 3 yields a relation $2^n/n^d \le |S_n| \le c^n$, from which we obtain
$c \ge 2/n^{d/n}$. Since $\lim_{n \to \infty} n^{d/n} = 1$, we reach a conclusion $c \ge 2$, which clearly contradicts the choice of $c$
in Claim 3. Therefore, Claim 1 holds.

To complete the proof of our proposition, we need to prove Claims 2 and 3, which are the core of our proof.
We begin with the proof of Claim 3 by making a direct estimation of the target value $|S_n|$.



Proof of Claim 3. Remember that $m$ is an odd constant and, for simplicity, we assume that $m \ge 5$. Now,
we introduce the notation $a_i^{(k)}$ as follows. For any two indices $k \in [1,m]_{\mathbb{Z}}$ and $i \in [m-1,n]_{\mathbb{Z}}$, let $a_i^{(k)}$ denote
the cardinality of the set $\{wb \in \Sigma^i \mid w \in S_{i-1} \;\&\; b \in \{0,1\} \;\&\; \#_0(wb) + k = \lceil i/2 \rceil + m_0 + 1\}$. In particular, it
always holds that $a_{m-1}^{(1)} = a_{m-1}^{(m)} = 1$. We see a quick example. When $m = 5$ and $i = 6$, we obtain $a_6^{(1)} = 6$,
$a_6^{(2)} = a_6^{(4)} = 15$, $a_6^{(3)} = 20$, and $a_6^{(5)} = 5$. For each index $j$, $|S_j|$ equals $\sum_{k=1}^{m} a_j^{(k)}$.
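
The quantities $a_i^{(k)}$ can be computed by a short dynamic program, which also reproduces the example with $m = 5$ and $i = 6$. The Python sketch below is an illustration under the reading of the definitions used here: windows $D_t = [\lceil t/2 \rceil - m_0, \lceil t/2 \rceil + m_0]$, counts taken over strings of length $i$ all of whose prefixes of length $t \in [m-1,i]$ have their number of 0s inside $D_t$, and $k$ indexing the count $\lceil i/2 \rceil + m_0 + 1 - k$ of 0s. The function name `a_values` is hypothetical.

```python
from math import comb


def a_values(m, length):
    """Return [a_length^(1), ..., a_length^(m)] for odd m and length >= m - 1."""
    m0 = m // 2
    # counts[j] = number of admissible strings of the current length having exactly j zeros.
    # At length m - 1 the window D_{m-1} = [0, m - 1] admits every string.
    counts = {j: comb(m - 1, j) for j in range(m)}
    for t in range(m, length + 1):
        center = (t + 1) // 2                  # ceil(t / 2)
        lo, hi = center - m0, center + m0      # window D_t
        # Appending 1 keeps the number of zeros; appending 0 increases it by one.
        counts = {j: counts.get(j, 0) + counts.get(j - 1, 0) for j in range(lo, hi + 1)}
    top = (length + 1) // 2 + m0               # number of zeros that corresponds to k = 1
    return [counts.get(top + 1 - k, 0) for k in range(1, m + 1)]
```

With these conventions, `a_values(5, 6)` yields the five values of the example, and summing the returned list gives $|S_{6}|$ in the sense of the last sentence above.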



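A simple observation yields the following relations among the $a_i^{(k)}$'s: for each index $k \in [1, m-2]_{\mathbb{Z}}$ and
$i \in [m_0+1, (n-1)/2]_{\mathbb{Z}}$,

\[ a_{2i+1}^{(1)} = a_{2i-1}^{(1)} + a_{2i-1}^{(2)}, \qquad a_{2i+1}^{(k+1)} = a_{2i-1}^{(k)} + 2a_{2i-1}^{(k+1)} + a_{2i-1}^{(k+2)}, \qquad a_{2i+1}^{(m)} = a_{2i-1}^{(m-1)} + 2a_{2i-1}^{(m)}. \qquad (1) \]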



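We want to show that (*) $\sum_{k=2}^{m-1} a_{2i+1}^{(k)} \le \gamma \sum_{k=1}^{m} a_{2i+1}^{(k)}$ for any index $i \in [m_0, (n-1)/2]_{\mathbb{Z}}$, where $\gamma$ is a
constant less than 1 independent of $i$. Assuming (*), since

\[ \sum_{k=1}^{m} a_{2i+1}^{(k)} = 2a_{2i-1}^{(1)} + 4\sum_{k=2}^{m-1} a_{2i-1}^{(k)} + 3a_{2i-1}^{(m)} \]

by (1), we obtain a recurrence:

\[ |S_{2i+1}| \le 3\sum_{k=1}^{m} a_{2i-1}^{(k)} + \sum_{k=2}^{m-1} a_{2i-1}^{(k)} \le (3+\gamma)\sum_{k=1}^{m} a_{2i-1}^{(k)} = (3+\gamma)\,|S_{2i-1}|. \]

This recurrence has a solution $|S_n| \le (3+\gamma)^{n/2}|S_m|$. Since $|S_m|$ is a constant and $1 < \sqrt{3+\gamma} < 2$, Claim 3
immediately follows.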

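Hereafter, we show (*). To show this, it suffices to show that (**) there exists a constant $\delta \ge 1$ for which
$\sum_{k=2}^{m-1} a_{2i+1}^{(k)} \le \delta\,(a_{2i+1}^{(1)} + a_{2i+1}^{(m)})$ for any index $i \in [m_0, (n-1)/2]_{\mathbb{Z}}$, because, from this inequality (**), the
following holds:

\[ \sum_{k=1}^{m} a_{2i+1}^{(k)} = \left(a_{2i+1}^{(1)} + a_{2i+1}^{(m)}\right) + \sum_{k=2}^{m-1} a_{2i+1}^{(k)} \ge \left(\frac{1}{\delta} + 1\right) \sum_{k=2}^{m-1} a_{2i+1}^{(k)}. \]

Hence, the desired constant $\gamma$ should be defined as $1/(1/\delta + 1)$, which is clearly less than 1. Our goal is
therefore to prove (**), which follows as a special case of the next claim.

Claim 4 For every index $j \in [0, m_0-1]_{\mathbb{Z}}$, $\sum_{k=m_0+1-j}^{m_0+1+j} a_{2i+1}^{(k)} \le \delta_j\,(a_{2i+1}^{(m_0-j)} + a_{2i+1}^{(m_0+2+j)})$, where
$\delta_j = 2^{2j+1} - 1$.

When $j = m_0 - 1$, this claim implies that $\sum_{k=2}^{m-1} a_{2i+1}^{(k)} \le \delta_{m_0-1}(a_{2i+1}^{(1)} + a_{2i+1}^{(m)})$. By setting $\delta = \delta_{m_0-1}$,
we obtain (**).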

To end the proof of Claim 3, we need to prove Claim 4 by induction on $j$. For the basis case $j = 0$, from
(1), we can estimate the sum $a_{2i+1}^{(m_0)} + a_{2i+1}^{(m_0+2)}$ as

\[ a_{2i+1}^{(m_0)} + a_{2i+1}^{(m_0+2)} = a_{2i-1}^{(m_0-1)} + 2a_{2i-1}^{(m_0)} + 2a_{2i-1}^{(m_0+1)} + 2a_{2i-1}^{(m_0+2)} + a_{2i-1}^{(m_0+3)} \ge a_{2i-1}^{(m_0)} + 2a_{2i-1}^{(m_0+1)} + a_{2i-1}^{(m_0+2)} = a_{2i+1}^{(m_0+1)}, \]

which yields the desired relation $a_{2i+1}^{(m_0+1)} \le \delta_0\,(a_{2i+1}^{(m_0)} + a_{2i+1}^{(m_0+2)})$ since $\delta_0 = 1$.

Next, let us consider the induction step with $0 \le j \le m_0 - 2$. As our induction hypothesis, we assume that
$\sum_{k=m_0+1-j}^{m_0+1+j} a_{2i+1}^{(k)} \le \delta_j\,(a_{2i+1}^{(m_0-j)} + a_{2i+1}^{(m_0+2+j)})$. We want to show that
$\sum_{k=m_0-j}^{m_0+2+j} a_{2i+1}^{(k)} \le \delta_{j+1}\,(a_{2i+1}^{(m_0-j-1)} + a_{2i+1}^{(m_0+j+3)})$. Here, we estimate the sum $\sum_{k=m_0-j}^{m_0+j+2} a_{2i+1}^{(k)}$,
applying the induction hypothesis with $2i-1$ in place of $2i+1$, as

\[ \sum_{k=m_0-j}^{m_0+j+2} a_{2i+1}^{(k)} = a_{2i-1}^{(m_0-j-1)} + 3a_{2i-1}^{(m_0-j)} + 4\sum_{k=m_0-j+1}^{m_0+j+1} a_{2i-1}^{(k)} + 3a_{2i-1}^{(m_0+j+2)} + a_{2i-1}^{(m_0+j+3)} \]
\[ \le a_{2i-1}^{(m_0-j-1)} + 3a_{2i-1}^{(m_0-j)} + 4\delta_j\left(a_{2i-1}^{(m_0-j)} + a_{2i-1}^{(m_0+j+2)}\right) + 3a_{2i-1}^{(m_0+j+2)} + a_{2i-1}^{(m_0+j+3)}. \]

Since $\delta_j$ satisfies that $\delta_{j+1} = 4\delta_j + 3$, the above sum is further bounded as

\[ \sum_{k=m_0-j}^{m_0+j+2} a_{2i+1}^{(k)} \le a_{2i-1}^{(m_0-j-1)} + \delta_{j+1}\left(a_{2i-1}^{(m_0-j)} + a_{2i-1}^{(m_0+j+2)}\right) + a_{2i-1}^{(m_0+j+3)} \le \delta_{j+1}\left(a_{2i+1}^{(m_0-j-1)} + a_{2i+1}^{(m_0+j+3)}\right), \]

from which we obtain the desired inequality for $j + 1$. By induction, Claim 4
holds. □
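
Claim 4 can also be explored numerically by iterating recurrence (1). The following Python sketch is a hedged illustration rather than a substitute for the induction: it assumes the starting vector $a_m^{(k)} = \binom{m}{m+1-k}$ at length $m$, which is what the window definitions give under the reading used here, and it checks the inequality of Claim 4 with $\delta_j = 2^{2j+1} - 1$ after each step. The names `step`, `claim4_holds`, and `check_claim4` are hypothetical.

```python
from math import comb


def step(a):
    """Apply recurrence (1): map (a_{2i-1}^(k))_k to (a_{2i+1}^(k))_k, with a[k-1] = a^(k)."""
    m = len(a)
    new = [a[0] + a[1]]
    new += [a[k - 1] + 2 * a[k] + a[k + 1] for k in range(1, m - 1)]
    new.append(a[m - 2] + 2 * a[m - 1])
    return new


def claim4_holds(a):
    """Check sum_{k=m0+1-j}^{m0+1+j} a^(k) <= delta_j (a^(m0-j) + a^(m0+2+j)) for all j."""
    m0 = len(a) // 2
    for j in range(m0):                               # j ranges over [0, m0 - 1]
        delta = 2 ** (2 * j + 1) - 1
        inner = sum(a[k - 1] for k in range(m0 + 1 - j, m0 + 2 + j))
        outer = a[m0 - j - 1] + a[m0 + 1 + j]         # a^(m0-j) and a^(m0+2+j)
        if inner > delta * outer:
            return False
    return True


def check_claim4(m, steps):
    """Start from the assumed vector at length m and test Claim 4 along `steps` iterations."""
    a = [comb(m, m + 1 - k) for k in range(1, m + 1)]
    for _ in range(steps):
        if not claim4_holds(a):
            return False
        a = step(a)
    return claim4_holds(a)
```

In the same setting, the ratio `sum(step(a)) / sum(a)` stays below 4, which is the numerical counterpart of the factor $3 + \gamma < 4$ in the recurrence for $|S_{2i+1}|$.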

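What follows is the proof of Claim 2, which also proceeds by induction.

Proof of Claim 2. Recall the notation $a_\ell^{(k)}$ and let $c_\ell^{(1)} \ge c_\ell^{(2)} \ge \cdots \ge c_\ell^{(m)}$ be an enumeration of all
the elements in $\{a_\ell^{(i)}\}_{1 \le i \le m}$ in a non-increasing order. For later convenience, whenever $a_\ell^{(i)} = a_\ell^{(j)}$ with
$i < j$, we place $a_\ell^{(i)}$ ahead of $a_\ell^{(j)}$. As a quick example, when $m = 5$ and $\ell = 6$, we obtain $c_6^{(1)} = a_6^{(3)}$,
$c_6^{(2)} = a_6^{(2)}$, $c_6^{(3)} = a_6^{(4)}$, $c_6^{(4)} = a_6^{(1)}$, and $c_6^{(5)} = a_6^{(5)}$. This enumeration admits the following recurrence:
$\sum_{i=1}^{k} c_{\ell+1}^{(i)} = \sum_{i=1}^{k-1} c_\ell^{(i)} + \sum_{i=1}^{k+1} c_\ell^{(i)}$ for any index $k \in [1,m]_{\mathbb{Z}}$, provided that $c_\ell^{(m+1)} = 0$.

Recall the notation $T_{E,\ell}$ and write $T_{E,\ell}[j]$ for the set $\{w \in T_{E,\ell} \mid \#_0(pref_{\ell}(w)) = j\}$. By our choice
of the $m$-index series $E$, we have $A \cap \Sigma^n \subseteq T_{E,n}$. Similar to the enumeration of $\{a_\ell^{(i)}\}_i$, we also enumerate
all the elements in $\{|T_{E,\ell}[j]|\}_{j \in E_\ell}$ as $e_{E,\ell}^{(1)} \ge e_{E,\ell}^{(2)} \ge \cdots \ge e_{E,\ell}^{(m)}$. Note that $|T_{E,n}| = \sum_{i=1}^{m} e_{E,n}^{(i)}$. We claim the
following.

Claim 5 For any pair $\ell \in [m-1,n]_{\mathbb{Z}}$ and $k \in [1,m]_{\mathbb{Z}}$, it holds that (*) $\sum_{i=1}^{k} c_\ell^{(i)} \ge \sum_{i=1}^{k} e_{E,\ell}^{(i)}$.

By choosing $\ell = n$ and $k = m$ in Claim 5, we obtain $|S_n| = \sum_{i=1}^{m} c_n^{(i)} \ge \sum_{i=1}^{m} e_{E,n}^{(i)} = |T_{E,n}| \ge |A \cap \Sigma^n|$,
as requested. The remaining task is to prove Claim 5.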

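This claim can be proven by double induction on $\ell$ and $k$. When $\ell = m$, the inequality (*) is true for
any index $k \in [1,m]_{\mathbb{Z}}$, because $\{e_{E,m}^{(i)}\}_i$ coincides with $\{c_m^{(i)}\}_i$. Assume that $m \le \ell < n$ and we target the
case $\ell + 1$. If (**) $\sum_{i=1}^{k} e_{E,\ell+1}^{(i)} \le \sum_{i=1}^{k-1} e_{E,\ell}^{(i)} + \sum_{i=1}^{k+1} e_{E,\ell}^{(i)}$, where $e_{E,\ell}^{(m+1)} = 0$, then we have

\[ \sum_{i=1}^{k} e_{E,\ell+1}^{(i)} \le \sum_{i=1}^{k-1} e_{E,\ell}^{(i)} + \sum_{i=1}^{k+1} e_{E,\ell}^{(i)} \le \sum_{i=1}^{k-1} c_{\ell}^{(i)} + \sum_{i=1}^{k+1} c_{\ell}^{(i)} = \sum_{i=1}^{k} c_{\ell+1}^{(i)}, \]

where the second inequality follows from the induction hypothesis that $\sum_{i=1}^{k-1} c_\ell^{(i)} \ge \sum_{i=1}^{k-1} e_{E,\ell}^{(i)}$ and
$\sum_{i=1}^{k+1} c_\ell^{(i)} \ge \sum_{i=1}^{k+1} e_{E,\ell}^{(i)}$. Therefore, we have $\sum_{i=1}^{k} c_{\ell+1}^{(i)} \ge \sum_{i=1}^{k} e_{E,\ell+1}^{(i)}$, as requested.

Finally, we show (**). We proceed by observing how to compute $T_{E,\ell+1}$ from $T_{E,\ell}$. Consider a
partition of $E_{\ell+1}$ into a number of blocks, say, $E_{\ell+1}^{(1)}, E_{\ell+1}^{(2)}, \ldots$, each of which has a form $[p,q]_{\mathbb{Z}}$ with $p \le q$.
Each block $E_{\ell+1}^{(i)} = [p_i,q_i]_{\mathbb{Z}}$ defines $T_{E,\ell+1}^{(i)} = \bigcup_{j \in E_{\ell+1}^{(i)}} T_{E,\ell+1}[j]$, whose cardinality $|T_{E,\ell+1}^{(i)}|$ is bounded from
above, similar to Equation (1), by $|T_{E,\ell}[p_i]| + 2\sum_{p_i < j < q_i} |T_{E,\ell}[j]| + |T_{E,\ell}[q_i]|$. By summing up such $|T_{E,\ell+1}^{(i)}|$'s,
$|T_{E,\ell+1}|$ can be upper-bounded by the sum of at least two terms, say, $|T_{E,\ell}[p']|$ and $|T_{E,\ell}[q']|$, plus two times
the sum of at most $k - 2$ remaining terms $|T_{E,\ell}[j]|$'s. In other words, since $|T_{E,\ell+1}| = \sum_{i=1}^{m} e_{E,\ell+1}^{(i)}$, we obtain
$\sum_{i=1}^{k} e_{E,\ell+1}^{(i)} \le \sum_{i=1}^{k-1} e_{E,\ell}^{(i)} + \sum_{i=1}^{k+1} e_{E,\ell}^{(i)}$. Claim 5 immediately follows. □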

This completes the proof of Proposition 6.2. □

Unlike the REG-bi-immunity, it is possible to prove the existence of context-free REG/n-bi-primeimmune
languages. Later, in Section 7, we shall show that a context-free language, called $IP_*$, is indeed
REG/n-bi-primeimmune.

7 Pseudorandomness of Languages 

From this section to the next section, we shall discuss "computational randomness" of context-free
languages. Although there are numerous ways to describe the intuitive notion of computational randomness,
we choose the following notion, which we prefer to call C-pseudorandomness to distinguish it from another notion of
"C-randomness" used in the past literature. Let $\Sigma$ denote our alphabet with $|\Sigma| \ge 2$ and let C be any
language family. Roughly speaking, a language $L$ over $\Sigma$ is C-pseudorandom when the characteristic function $\chi_A$
of any language $A$ in C agrees with $\chi_L$ over "nearly" 50% of strings of each length, where the word "nearly"
is meant for "negligibly small margin." In other words, since $L \triangle A = \{x \in \Sigma^* \mid \chi_L(x) \ne \chi_A(x)\}$, the density
$dense(L \triangle A)(n)$ "nearly" halves the total size $|\Sigma^n|$. This new notion can be seen as a non-asymptotic variant
of Wilber's randomness [57] (which is also referred to as Wilber-stochasticity in [5]) and Meyer-McCreight's
randomness [25].

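Let us formalize our intuitive notion. We say that a language $L$ over $\Sigma$ is $\mathcal{C}$-pseudorandom if, for any
language $A$ over $\Sigma$ in $\mathcal{C}$, the function

\[ \ell(n) = \left| \frac{\mathrm{dense}(L \triangle A)(n)}{|\Sigma^n|} - \frac{1}{2} \right| \]

is negligible; namely, for any non-zero polynomial $p$, there is a number $n_0 \ge 1$ such that
(*) $\left| \frac{\mathrm{dense}(L \triangle A)(n)}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{p(n)}$ for all numbers $n \ge n_0$.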

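Assuming that $\emptyset \in \mathcal{C}$, we note that, by setting $A = \emptyset$ in (*), every $\mathcal{C}$-pseudorandom language $L$ should
satisfy

\[ \frac{1}{2} - \frac{1}{p(n)} \le \frac{\mathrm{dense}(L)(n)}{|\Sigma^n|} \le \frac{1}{2} + \frac{1}{p(n)} \qquad (2) \]

for any non-zero polynomial $p$ and for all but finitely-many lengths $n \in \mathbb{N}$.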

Similar in spirit to the previous $\mathcal{C}$-primeimmunity, we can naturally restrict our attention to p-dense
languages in $\mathcal{C}$. As a non-asymptotic variant of the notions of Müller's balanced immunity [23] and
weak-stochasticity of Ambos-Spies et al. [2], we introduce another notion, called weak $\mathcal{C}$-pseudorandomness, which
refers to a language that "nearly" splits every p-dense set in $\mathcal{C}$ by half. Let $\mathcal{C}$ be any language family
containing the set $\Sigma^*$. Formally, a language $L$ over $\Sigma$ is called weak $\mathcal{C}$-pseudorandom if, for every p-dense
language $A$ in $\mathcal{C}$, the function

\[ \ell'(n) = \left| \frac{\mathrm{dense}(L \cap A)(n)}{\mathrm{dense}(A)(n)} - \frac{1}{2} \right| \]

is negligible. By choosing $A = \Sigma^*$, provided that $\Sigma^* \in \mathcal{C}$, we can show that $L$ satisfies Equation (2), and thus
$L$ cannot belong to $\mathcal{C}$.

For any language family $\mathcal{D}$, we say that $\mathcal{D}$ is $\mathcal{C}$-pseudorandom (resp., weak $\mathcal{C}$-pseudorandom) if $\mathcal{D}$ contains
a $\mathcal{C}$-pseudorandom (resp., weak $\mathcal{C}$-pseudorandom) language. In fact, as we shall show later, CFL is
REG-pseudorandom.
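The definition of $\ell(n)$ can be checked by brute force on short lengths. The following Python sketch is only an illustration under stated assumptions: the helper names `strings`, `gap`, `odd_parity`, and `ends_in_one` are hypothetical, and the two sample regular languages are chosen here for illustration and do not come from the paper. It computes $\ell(n)$ for one candidate language $L$ against one test language $A$ over the binary alphabet.

```python
from fractions import Fraction
from itertools import product

def strings(n):
    """Yield all binary strings of length n."""
    return ("".join(bits) for bits in product("01", repeat=n))

def gap(L, A, n):
    """Return | dense(L symmetric-difference A)(n) / 2^n - 1/2 |.

    L and A are membership predicates (hypothetical stand-ins for languages).
    """
    differing = sum(1 for x in strings(n) if L(x) != A(x))
    return abs(Fraction(differing, 2 ** n) - Fraction(1, 2))

def odd_parity(x):
    """Sample regular language: strings with an odd number of 1s."""
    return x.count("1") % 2 == 1

def ends_in_one(x):
    """Sample regular language: strings whose last symbol is 1."""
    return x.endswith("1")
```

For instance, `gap(odd_parity, ends_in_one, 6)` is 0, whereas `gap(odd_parity, odd_parity, 6)` is 1/2. The gap must be negligible against every $A$ in the family, and a language always fails this test against itself, which is why a language in the family cannot be pseudorandom against that family.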

Meanwhile, we want to explore useful characteristics of (weak) C-pseudorandom languages. The following 
lemma gives other characterizations of weak C-pseudorandomness. 

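Lemma 7.1 Assume that $|\Sigma| \ge 2$. Let $\mathcal{C}$ be any language family that is closed under complementation.
For every set $S \subseteq \Sigma^*$, the following three statements are equivalent.

1. $S$ is weak $\mathcal{C}$-pseudorandom.

2. The function $\ell(n) = \left| \frac{\mathrm{dense}(S \triangle A)(n)}{|\Sigma^n|} - \frac{1}{2} \right|$ is negligible for every p-dense language $A \in \mathcal{C}$ over $\Sigma$.

3. The function $\ell''(n) = \left| \frac{\mathrm{dense}(S \cap A)(n)}{|\Sigma^n|} - \frac{\mathrm{dense}(\overline{S} \cap A)(n)}{|\Sigma^n|} \right|$ is negligible for every p-dense language $A \in \mathcal{C}$ over $\Sigma$.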



Notice that the statements (2) and (3) are still equivalent even if we remove the requirement of the
p-denseness of $A$. Hence, with an appropriate change, a similar characterization of $\mathcal{C}$-pseudorandomness
follows. For later reference, we call this fact a "pseudorandom" version of Lemma 7.1.

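Proof of Lemma 7.1. Let $\Sigma$ be our alphabet with $|\Sigma| \ge 2$ and let $S$ be any language over $\Sigma$. We use the
following abbreviation: write $S_n$ for $S \cap \Sigma^n$ and $\overline{S}_n$ for $\overline{S} \cap \Sigma^n$. A language family $\mathcal{C}$ satisfies $\mathcal{C} = \mathrm{co}\text{-}\mathcal{C}$.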
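(1 ⇒ 2) Assume (1). Choose an arbitrary non-zero polynomial $p$ and also any p-dense language $A$ in $\mathcal{C}$.
By the p-denseness of $A$, there exists another non-zero polynomial $q$ satisfying that $|A_n| \ge |\Sigma^n|/q(n)$ for all
but finitely-many numbers $n$. Hereafter, we assume that $n$ is a sufficiently large number. From (1) follows
the inequality

\[ \left| \frac{\mathrm{dense}(S \cap A)(n)}{\mathrm{dense}(A)(n)} - \frac{1}{2} \right| \le \frac{1}{2p(n)}, \]

which is equivalent to $\bigl| |A_n \cap S_n| - |A_n \cap \overline{S}_n| \bigr| \le |A_n|/p(n)$.

The closure property of $\mathcal{C}$ under complementation implies that $\overline{A}$ is also in $\mathcal{C}$. Hence, similar to the case of
$A$, we obtain another inequality $\bigl| |\overline{A}_n \cap S_n| - |\overline{A}_n \cap \overline{S}_n| \bigr| \le |\overline{A}_n|/p(n)$.

Since $|S_n \triangle A_n| = |A_n \cap \overline{S}_n| + |\overline{A}_n \cap S_n|$ and $|S_n \triangle \overline{A}_n| = |A_n \cap S_n| + |\overline{A}_n \cap \overline{S}_n|$, and since
$|S_n \triangle A_n| + |S_n \triangle \overline{A}_n| = |\Sigma^n|$, it follows that

\[ \bigl| 2|S_n \triangle A_n| - |\Sigma^n| \bigr| = \bigl| |S_n \triangle A_n| - |S_n \triangle \overline{A}_n| \bigr| \]
\[ \le \bigl| |A_n \cap \overline{S}_n| - |A_n \cap S_n| \bigr| + \bigl| |\overline{A}_n \cap S_n| - |\overline{A}_n \cap \overline{S}_n| \bigr| \]
\[ \le \frac{|A_n|}{p(n)} + \frac{|\overline{A}_n|}{p(n)} = \frac{|\Sigma^n|}{p(n)}. \]

Using this inequality, we obtain

\[ \ell(n) = \left| \frac{\mathrm{dense}(S \triangle A)(n)}{|\Sigma^n|} - \frac{1}{2} \right| = \frac{\bigl| 2|S_n \triangle A_n| - |\Sigma^n| \bigr|}{2|\Sigma^n|} \le \frac{1}{p(n)}. \]

Since $p$ is arbitrary, the above bound of $\ell(n)$ clearly implies (2).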

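(2 ⇒ 3) Assume (2). Let $p$ be any non-zero polynomial and let $A$ be any p-dense language in $\mathcal{C}$. From
(2), we can assume that $\ell(n) = \left| \frac{|S_n \triangle A_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{2p(n)}$ for any sufficiently large number $n$. Since $\Sigma^* \in \mathcal{C}$
and $S \triangle \Sigma^* = \overline{S}$, it also holds that $\left| \frac{|S_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{2p(n)}$. Hence, since
$\bigl| |S_n \cap A_n| - |\overline{S}_n \cap A_n| \bigr| = \bigl| |S_n \triangle A_n| - |S_n| \bigr|$, we can bound the term $\ell''(n)$ as

\[ \ell''(n) = \frac{\bigl| |S_n \cap A_n| - |\overline{S}_n \cap A_n| \bigr|}{|\Sigma^n|} \le \left| \frac{|S_n \triangle A_n|}{|\Sigma^n|} - \frac{1}{2} \right| + \left| \frac{|S_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{2p(n)} + \frac{1}{2p(n)} = \frac{1}{p(n)}. \]

Therefore, (3) holds.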

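(3 ⇒ 1) Assume (3). For any non-zero polynomial $p$ and any p-dense language $A$ in $\mathcal{C}$, take a certain
non-zero polynomial $q$ such that $|A_n| \ge |\Sigma^n|/q(n)$ for any sufficiently large number $n$. We then have

\[ \ell'(n) = \left| \frac{|S_n \cap A_n|}{|A_n|} - \frac{1}{2} \right| = \frac{\bigl| |S_n \cap A_n| - |\overline{S}_n \cap A_n| \bigr|}{2|A_n|} \le q(n) \cdot \frac{\bigl| |S_n \cap A_n| - |\overline{S}_n \cap A_n| \bigr|}{|\Sigma^n|}. \]

Since $\frac{\bigl| |S_n \cap A_n| - |\overline{S}_n \cap A_n| \bigr|}{|\Sigma^n|} \le \frac{1}{p(n)q(n)}$ holds by (3) applied to the polynomial $p(n)q(n)$, the above
inequality implies that $\ell'(n) \le 1/p(n)$. The arbitrariness of $p$ leads to the conclusion that $\ell'(n)$ is
negligible, or equivalently (1) holds. □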

Notice that the two implications (2 ⇒ 3) and (3 ⇒ 1) in the proof of Lemma 7.1 require no extra closure
property of $\mathcal{C}$. As an immediate consequence, we can draw the following relationship for any language family
$\mathcal{C}$.

Corollary 7.2 Every $\mathcal{C}$-pseudorandom language is weak $\mathcal{C}$-pseudorandom.

We further argue that weak $\mathcal{C}$-pseudorandomness implies $\mathcal{C}$-primeimmunity. This implication bridges
primeimmunity and pseudorandomness.

Lemma 7.3 Let $\mathcal{C}$ be any language family that is closed under complementation. Every weak $\mathcal{C}$-pseudorandom
language is $\mathcal{C}$-bi-primeimmune.

Proof. Let $S$ be any weak $\mathcal{C}$-pseudorandom language. Assuming that $S$ is not $\mathcal{C}$-primeimmune, we
take a p-dense subset $A$ of $S$ in $\mathcal{C}$. From the p-denseness of $A$, there exist a non-zero polynomial $p$ and
a constant $n_0 \in \mathbb{N}^+$ satisfying that $|A_n| \ge 2^n/p(n)$ for all numbers $n \ge n_0$. Since $A \subseteq S$, it follows that

\[ \left| \frac{|S_n \cap A_n|}{|A_n|} - \frac{1}{2} \right| = \left| \frac{|A_n|}{|A_n|} - \frac{1}{2} \right| = \left| 1 - \frac{1}{2} \right| = \frac{1}{2}, \]

which is clearly not negligible. Hence, $S$ is not weak $\mathcal{C}$-pseudorandom. A similar argument can be carried out
under the assumption that $\overline{S}$ is not $\mathcal{C}$-primeimmune. As a consequence, $S$ is $\mathcal{C}$-bi-primeimmune. □

The converse of Lemma 7.3, however, does not hold in general. As a counterexample, we present a
context-free language that is REG/n-primeimmune but not weak REG/n-pseudorandom. Our example is
the language $Equal_*$, defined in an earlier section.

Proposition 7.4 The language family CFL contains a REG/n-primeimmune language that is not weak
REG/n-pseudorandom.

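Proof. In Proposition 6.2, the context-free language $Equal_*$ is shown to be REG/n-primeimmune. Hence,
our remaining task is to show that $Equal_*$ is not weak REG/n-pseudorandom. Choose $A = \Sigma^*$ in REG/n
and consider the function $\ell(n) = \left| \frac{\mathrm{dense}(Equal_* \triangle A)(n)}{2^n} - \frac{1}{2} \right|$. Obviously, $\ell(n)$ is bounded by

\[ \ell(n) = \left| \frac{2^n - \mathrm{dense}(Equal_*)(n)}{2^n} - \frac{1}{2} \right| = \frac{1}{2} - \frac{\mathrm{dense}(Equal_*)(n)}{2^n} \ge \frac{1}{2} - \frac{1}{2^n}\binom{n}{n/2} \ge \frac{1}{4} \]

for any sufficiently large number $n$, because $\binom{2k}{k} \le 2^{2k}/\sqrt{\pi k}$ holds for every $k \ge 1$. Since $\ell(n) \ge 1/4$, $Equal_*$ cannot be
weak REG-pseudorandom. □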
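The bound $\ell(n) \ge 1/4$ can be checked numerically for small lengths. The Python sketch below is an illustration only, and it rests on a loudly stated assumption: since the definition of $Equal_*$ lies outside this section, the code takes it to be the set of binary strings with equally many 0s and 1s. The function names `dense_equal`, `closed_form`, and `gap` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def dense_equal(n):
    """Brute-force count of length-n binary strings with as many 0s as 1s.

    Assumption: this is the intended density of Equal_* at length n.
    """
    return sum(1 for bits in product("01", repeat=n)
               if bits.count("0") == bits.count("1"))

def closed_form(n):
    """Central binomial coefficient for even n, and 0 for odd n."""
    return comb(n, n // 2) if n % 2 == 0 else 0

def gap(n):
    """Return |(2^n - dense)/2^n - 1/2|, i.e. the function l(n) with A = Sigma^*."""
    return abs(Fraction(2 ** n - dense_equal(n), 2 ** n) - Fraction(1, 2))
```

Under this assumption, `gap(8)` equals 29/128, which is still below 1/4, while `gap(10)` equals 65/256, which exceeds 1/4; this matches the phrase "for any sufficiently large number n" in the proof.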

Proposition 6.2 makes CFL REG/n-primeimmune. We shall strengthen CFL's REG/n-primeimmunity
by proving that CFL is actually REG/n-pseudorandom. Since REG/n-pseudorandomness implies
REG/n-bi-primeimmunity by Corollary 7.2 and Lemma 7.3, we can conclude that CFL is also
REG/n-bi-primeimmune, as stated in Section 6.

Proposition 7.5 The language family CFL is REG/n-pseudorandom.

To prove Proposition 7.5, we introduce a context-free language $IP_*$. First, let us define the (binary) inner
product of $x$ and $y$ as $x \odot y = \sum_{i=1}^{n} x_i \cdot y_i$, where $x = x_1x_2\cdots x_n$ and $y = y_1y_2\cdots y_n$ are n-bit strings. The
language $IP_*$ is defined as $IP_* = \{auv \mid a \in \{\lambda, 0, 1\}, |u| = |v|, u^R \odot v \equiv 1 \pmod 2\}$. Now, we demonstrate
that $IP_*$ is context-free. Let us consider the following npda $M$. On input $auv$, we nondeterministically check
two possibilities. Along one computation path, we assume that $a = \lambda$, and we nondeterministically check if
$|u| = |v|$ and $u^R \odot v \equiv 1 \pmod 2$. On the other path, we assume that $a \ne \lambda$, and we ignore the first bit $a$ and
check if $|u| = |v|$ and $u^R \odot v \equiv 1 \pmod 2$. The latter condition $u^R \odot v \equiv 1 \pmod 2$ can be checked by storing
$u$ in a (first-in last-out) stack and then computing each $u_{n/2-j+1} \cdot v_j$ while reading $v_j$, where $j = 1, 2, \ldots, n/2$ and $n = |uv|$.
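The stack-based check described above can be sketched in Python as follows. This is a deterministic simulation under stated assumptions, not the npda itself: the code knows the input length in advance, so it computes the midpoint instead of guessing it nondeterministically. The function name `in_ip_star` is hypothetical, and inputs are assumed to be binary strings.

```python
def in_ip_star(s):
    """Decide membership of a binary string s in IP_*.

    Odd length forces |a| = 1, so the first bit is ignored; even length
    forces a to be the empty string.
    """
    body = s[1:] if len(s) % 2 == 1 else s
    half = len(body) // 2
    stack = list(body[:half])          # push u symbol by symbol
    parity = 0
    for v_j in body[half:]:            # read v from left to right
        u_sym = stack.pop()            # pops u in reverse order, giving u^R
        parity ^= int(u_sym) & int(v_j)
    return parity == 1                 # u^R (.) v is odd
```

For example, `in_ip_star("0110")` holds because $u = 01$, $v = 10$, and $u^R \odot v = 1\cdot 1 + 0 \cdot 0 = 1$, whereas `in_ip_star("10")` does not hold.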

The reader may pay attention to the fact that $IP_*$ is REG-levelable, because, by Lemma 4.5,
$f(auv) = a0uv0$ is a length-increasing 1-DLIN-invertible 1-DLIN-m-autoreduction for $IP_*$.

Our proof of Proposition 7.5 requires a certain unique property of REG/n, called a swapping property,
which has a loose similarity with the swapping lemma for regular languages.

Lemma 7.6 [swapping property lemma] Let $S$ be any language over an alphabet $\Sigma$. If $S \in$ REG/n, then
there exists a positive integer $m$ that satisfies the following property. For any three numbers $n, \ell_1(n), \ell_2(n) \in \mathbb{N}$
with $\ell_1(n) + \ell_2(n) = n$, there is a group of disjoint sets, say, $S_1^{(n)}, S_2^{(n)}, \ldots, S_m^{(n)}$ such that (i) $S \cap \Sigma^n =
\bigcup_{i=1}^{m} S_i^{(n)}$ and (ii) (swapping property) for any pair $x, y \in S_i^{(n)}$, if $x = x_1x_2$ and $y = y_1y_2$ with $|x_j| = |y_j| =
\ell_j(n)$ for each index $j \in \{1, 2\}$, then the swapped strings $x_1y_2$ and $y_1x_2$ are in $S_i^{(n)}$.

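Proof. From our assumption $S \in$ REG/n, we choose a dfa $M$ with a set $Q$ of inner states, and an advice
function $h : \mathbb{N} \to \Gamma^*$ with $|h(n)| = n$ satisfying that, for every string $x \in \Sigma^*$, $x \in S$ iff $M$ accepts $\left[\begin{smallmatrix} x \\ h(|x|) \end{smallmatrix}\right]$.
Assume that $Q = \{q_1, q_2, \ldots, q_m\}$ with $m \ge 1$. For any numbers $n, \ell_1(n), \ell_2(n) \in \mathbb{N}$ with $\ell_1(n) + \ell_2(n) = n$,
we define $S_i^{(n)}$ as the set $\{x_1x_2 \in S \cap \Sigma^n \mid |x_1| = \ell_1(n), |x_2| = \ell_2(n), M \text{ enters } q_i \text{ after reading } \left[\begin{smallmatrix} x_1 \\ h_1 \end{smallmatrix}\right]\}$, where
$h_1$ satisfies $h(n) = h_1h_2$ and $|h_1| = \ell_1(n)$. It is clear that $S \cap \Sigma^n = \bigcup_{i=1}^{m} S_i^{(n)}$. If $x_1x_2$ and $y_1y_2$ are in $S_i^{(n)}$,
then $M$'s inner states after reading $\left[\begin{smallmatrix} x_1 \\ h_1 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} y_1 \\ h_1 \end{smallmatrix}\right]$ are the same state $q_i$. Since $M$ accepts both $\left[\begin{smallmatrix} x_1x_2 \\ h_1h_2 \end{smallmatrix}\right]$
and $\left[\begin{smallmatrix} y_1y_2 \\ h_1h_2 \end{smallmatrix}\right]$, $M$ also accepts both $\left[\begin{smallmatrix} x_1y_2 \\ h_1h_2 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} y_1x_2 \\ h_1h_2 \end{smallmatrix}\right]$. This implies that $x_1y_2$ and $y_1x_2$ belong to $S_i^{(n)}$. □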
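The construction in this proof can be tried out on a toy advised automaton. The Python sketch below is an illustration under stated assumptions: the two-state dfa, its transition rule `step`, and the advice function `advice` are hypothetical choices made here and are not taken from the paper. The code partitions $S \cap \Sigma^n$ by the state reached after the first $\ell_1$ symbols and then tests the swapping property by exhaustive search.

```python
from itertools import product

def strings(n):
    """Yield all binary strings of length n."""
    return ("".join(bits) for bits in product("01", repeat=n))

def advice(n):
    """Hypothetical advice h(n) of length n: marks positions 0, 2, 4, ..."""
    return "".join("1" if i % 2 == 0 else "0" for i in range(n))

def step(state, c, h):
    """Toy dfa on two tracks: flip the state when input and advice are both 1."""
    return state ^ 1 if (c == "1" and h == "1") else state

def run(state, x, h):
    """Run the toy dfa from `state` on input track x and advice track h."""
    for c, a in zip(x, h):
        state = step(state, c, a)
    return state

def accepts(x):
    """x is accepted iff the dfa, started in state 0, ends in state 1."""
    return run(0, x, advice(len(x))) == 1

def swap_classes(n, l1):
    """Group accepted length-n strings by the state reached after l1 symbols."""
    h = advice(n)
    classes = {}
    for x in strings(n):
        if accepts(x):
            classes.setdefault(run(0, x[:l1], h[:l1]), set()).add(x)
    return classes

def closed_under_swapping(cls, l1):
    """Check that swapping the two halves of any pair stays inside the class."""
    return all(x[:l1] + y[l1:] in cls and y[:l1] + x[l1:] in cls
               for x in cls for y in cls)
```

With `classes = swap_classes(6, 3)`, every class passes `closed_under_swapping`, and the number of classes is at most the number of states, which plays the role of the constant $m$ in the lemma.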



Now, we are ready to give the proof of Proposition 7.5. In the proof, we utilize a well-known discrepancy
upper bound of the inner-product-modulo-two function.



Proof of Proposition 7.5. We shall show that $IP_*$ is REG/n-pseudorandom. Assume on the contrary
that, by a "pseudorandom" version of Lemma 7.1, there is a set $S$ in REG/n, a polynomial $p$, and an infinite
set $I \subseteq \mathbb{N}$ such that

\[ \ell''(n) =_{def} \left| \frac{|S \cap IP_* \cap \Sigma^n|}{|\Sigma^n|} - \frac{|S \cap \overline{IP_*} \cap \Sigma^n|}{|\Sigma^n|} \right| > \frac{1}{p(n)} \]

for all numbers $n \in I$. Take a constant $m$ given in Lemma 7.6. Let $n$ be any sufficiently large number in $I$ satisfying $m < 2^{n/8}$ and
$p(n) < 2^{n/8}$, and consider any input $auv$ of length $n$. It is sufficient to check the case where $n$ is even (that
is, $a = \lambda$), because, when $n$ is odd, we can ignore the first bit $a$ and follow a similar argument. Abbreviate
$S \cap IP_* \cap \Sigma^n$ and $S \cap \overline{IP_*} \cap \Sigma^n$ by $U_1$ and $U_0$, respectively. From our assumption, it then follows that
$\bigl| |U_1| - |U_0| \bigr| = \ell''(n)|\Sigma^n| > 2^n/p(n)$.

By setting $\ell_1(n) = \ell_2(n) = n/2$, we take $S_1^{(n)}, \ldots, S_m^{(n)}$ given by Lemma 7.6. Let us consider two partitions
$U_0 = \bigcup_{i \in [1,m]_{\mathbb{Z}}} U_0^{(i)}$ and $U_1 = \bigcup_{i \in [1,m]_{\mathbb{Z}}} U_1^{(i)}$, where $U_1^{(i)} = IP_* \cap S_i^{(n)}$ and $U_0^{(i)} = \overline{IP_*} \cap S_i^{(n)}$. Toward our
desired contradiction, we aim at proving the inequality $\bigl| |U_1| - |U_0| \bigr| \le 2^n/p(n)$. For our purpose, we claim
the following.

Claim 6 For all indices $i \in [1,m]_{\mathbb{Z}}$, $\bigl| |U_1^{(i)}| - |U_0^{(i)}| \bigr| \le 2^{3n/4}$.

From this claim, since $m < 2^{n/8}$ and $p(n) < 2^{n/8}$, it follows that

\[ \bigl| |U_1| - |U_0| \bigr| \le \sum_{i \in [1,m]_{\mathbb{Z}}} \bigl| |U_1^{(i)}| - |U_0^{(i)}| \bigr| \le m \cdot 2^{3n/4} < 2^{7n/8} < \frac{2^n}{p(n)}. \]

This consequence obviously contradicts our assumption that $\bigl| |U_1| - |U_0| \bigr| > 2^n/p(n)$. Hence, the proposition
follows immediately.

Now, we give the proof of Claim 6. For this proof, we need a discrepancy upper bound of the
inner-product-modulo-two function. Let $M$ be a $\Sigma^{n/2}$-by-$\Sigma^{n/2}$ matrix whose $(x, y)$-entry has a value $x \odot y \pmod 2$.
The discrepancy of a rectangle $A \times B$ in $M$ is $\mathrm{Disc}_M(A \times B) = 2^{-n} \bigl| \#_1^{(M)}(A \times B) - \#_0^{(M)}(A \times B) \bigr|$, where
$\#_b^{(M)}(A \times B)$ means the total number of $b$ entries in $M$ when $M$'s entries are limited to $A \times B$. It is known
that $\mathrm{Disc}_M(A \times B) \le 2^{-3n/4}\sqrt{|A||B|} \le 2^{-n/4}$ (see, e.g., [Example 12.14]). Although it is not quite tight,
this loose bound still serves well for our purpose.

For each index $i \in [1,m]_{\mathbb{Z}}$, we define two sets $A_i = \{u \in \Sigma^{n/2} \mid \exists v \in \Sigma^{n/2}\,[uv \in S_i^{(n)}]\}$ and $B_i = \{v \in
\Sigma^{n/2} \mid \exists u \in \Sigma^{n/2}\,[uv \in S_i^{(n)}]\}$, and we claim the following equation, in which $A_i^R = \{w^R \mid w \in A_i\}$.

Claim 7 For each bit $b$, $\#_b^{(M)}(A_i^R \times B_i) = |U_b^{(i)}|$.

It is clear from this claim that $2^{-n} \bigl| |U_1^{(i)}| - |U_0^{(i)}| \bigr| = \mathrm{Disc}_M(A_i^R \times B_i) \le 2^{-n/4}$. This inequality implies
that $\bigl| |U_1^{(i)}| - |U_0^{(i)}| \bigr| \le 2^{3n/4}$, as in Claim 6.

To end our proof, we shall prove Claim 7. Consider the case $b = 0$. The other case is similar and omitted
here. First, let $M'$ be another $\Sigma^{n/2}$-by-$\Sigma^{n/2}$ matrix in which the value of each $(x, y)$-entry is $x^R \odot y \pmod 2$.
Obviously, we have $\#_0^{(M')}(A_i \times B_i) = \#_0^{(M)}(A_i^R \times B_i)$. Second, we show that
$A_i \times B_i = S_i^{(n)}$ by identifying $(u, v)$ with $uv$ whenever $|u| = |v|$. This is shown as follows. Assume that
$uv \in S_i^{(n)}$. By the definitions of $A_i$ and $B_i$, it follows that $u \in A_i$ and $v \in B_i$; hence, $(u, v) \in A_i \times B_i$.
Conversely, assume that $(u, v) \in A_i \times B_i$. Take $\hat{u}, \hat{v} \in \Sigma^{n/2}$ such that $u\hat{v} \in S_i^{(n)}$ and $\hat{u}v \in S_i^{(n)}$. The swapping
property of $S_i^{(n)}$ given in Lemma 7.6 implies that $uv \in S_i^{(n)}$. Therefore, it holds that $A_i \times B_i = S_i^{(n)}$.

From the above two equations, it follows that $\#_0^{(M)}(A_i^R \times B_i) = \#_0^{(M')}(A_i \times B_i) = |S_i^{(n)} \cap \overline{IP_*}| = |U_0^{(i)}|$.
From this equation, Claim 7 follows. □
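The discrepancy bound used in the proof of Claim 6 can be verified exhaustively for very small parameters. The Python sketch below is only a sanity check under stated assumptions: it encodes $k$-bit strings as integers, writes $k$ for $n/2$, and uses the hypothetical names `ip_bit` and `worst_ratio`. With $n = 2k$, the inequality $\mathrm{Disc}_M(A \times B) \le 2^{-3n/4}\sqrt{|A||B|}$ is the same as $|\#_1 - \#_0| \le \sqrt{2^k |A||B|}$, and the code computes the largest ratio between the two sides over all nonempty rectangles.

```python
from math import sqrt

def ip_bit(x, y):
    """Inner product modulo 2 of two k-bit strings encoded as integers."""
    return bin(x & y).count("1") % 2

def worst_ratio(k):
    """Max over nonempty rectangles A x B of |#1 - #0| / sqrt(2^k * |A| * |B|)."""
    size = 1 << k
    # sign[x][y] is +1 for a 0 entry and -1 for a 1 entry of the matrix M.
    sign = [[1 - 2 * ip_bit(x, y) for y in range(size)] for x in range(size)]
    worst = 0.0
    for a_mask in range(1, 1 << size):          # every nonempty row set A
        rows = [x for x in range(size) if (a_mask >> x) & 1]
        col_sum = [sum(sign[x][y] for x in rows) for y in range(size)]
        for b_mask in range(1, 1 << size):      # every nonempty column set B
            cols = [y for y in range(size) if (b_mask >> y) & 1]
            diff = abs(sum(col_sum[y] for y in cols))   # |#0 - #1| on A x B
            worst = max(worst, diff / sqrt(size * len(rows) * len(cols)))
    return worst
```

A ratio of at most 1 for $k = 2$ and $k = 3$ is consistent with the stated bound; the search is exponential in $2^k$, so it is not meant for larger $k$.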

To close this section, we exhibit a closure property of the family of $\mathcal{C}$-pseudorandom languages under a
certain relation between two languages. Two languages $A$ and $B$ over the same alphabet $\Sigma$ are said to be
almost equal if the function $\delta(n) = \frac{\mathrm{dense}(A \triangle B)(n)}{|\Sigma^n|}$ is negligible. Note that this binary relation is actually an
equivalence relation (satisfying reflexivity, symmetry, and transitivity).

Lemma 7.7 Let $\mathcal{C}$ be any language family and let $A$ and $B$ be any two languages over an alphabet $\Sigma$. If $A$
and $B$ are almost equal and $A$ is $\mathcal{C}$-pseudorandom, then $B$ is also $\mathcal{C}$-pseudorandom.

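Proof. Let $A$ and $B$ be any two languages over an alphabet $\Sigma$. We assume that $A$ is $\mathcal{C}$-pseudorandom for
a language family $\mathcal{C}$ and that $A$ and $B$ are almost equal. As before, we use the following abbreviation: for
each number $n \in \mathbb{N}$ and a language $D$, write $D_n$ and $\overline{D}_n$ for $D \cap \Sigma^n$ and $\overline{D} \cap \Sigma^n$, respectively. To show the
$\mathcal{C}$-pseudorandomness of $B$, let $p$ be any non-zero polynomial, let $C$ be any language in $\mathcal{C}$, and let $n$ be any
number, which is sufficiently large to withstand our argument that proceeds in the rest of this proof.

It suffices to show that (*) $\left| \frac{|B_n \triangle C_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{p(n)}$. Since $A$ is $\mathcal{C}$-pseudorandom, it holds that
$\left| \frac{|A_n \triangle C_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \frac{1}{2p(n)}$. Moreover, since $A$ and $B$ are almost equal, we have $\frac{|A_n \triangle B_n|}{|\Sigma^n|} \le \frac{1}{4p(n)}$. It is not difficult to show that
$\overline{A}$ and $\overline{B}$ are also almost equal; thus, it follows that $\frac{|\overline{A}_n \triangle \overline{B}_n|}{|\Sigma^n|} \le \frac{1}{4p(n)}$.

We first give an upper bound of $\bigl| |B_n \triangle C_n| - |A_n \triangle C_n| \bigr|$. Note that

\[ \bigl| |B_n \triangle C_n| - |A_n \triangle C_n| \bigr| \le \bigl| |B_n \cap \overline{C}_n| - |A_n \cap \overline{C}_n| \bigr| + \bigl| |\overline{B}_n \cap C_n| - |\overline{A}_n \cap C_n| \bigr|. \]

The first term of the right side of the above formula is bounded by

\[ \bigl| |B_n \cap \overline{C}_n| - |A_n \cap \overline{C}_n| \bigr| \le |A_n \cap \overline{B}_n| + |B_n \cap \overline{A}_n| = |A_n \triangle B_n|. \]

A similar bound is given for $\bigl| |\overline{B}_n \cap C_n| - |\overline{A}_n \cap C_n| \bigr|$. Combining these two bounds leads to

\[ \frac{\bigl| |B_n \triangle C_n| - |A_n \triangle C_n| \bigr|}{|\Sigma^n|} \le \frac{|A_n \triangle B_n|}{|\Sigma^n|} + \frac{|\overline{A}_n \triangle \overline{B}_n|}{|\Sigma^n|} \le \frac{1}{4p(n)} + \frac{1}{4p(n)} = \frac{1}{2p(n)}. \]

From this bound, our desired inequality (*) follows as

\[ \left| \frac{|B_n \triangle C_n|}{|\Sigma^n|} - \frac{1}{2} \right| \le \left| \frac{|A_n \triangle C_n|}{|\Sigma^n|} - \frac{1}{2} \right| + \frac{\bigl| |B_n \triangle C_n| - |A_n \triangle C_n| \bigr|}{|\Sigma^n|} \le \frac{1}{2p(n)} + \frac{1}{2p(n)} = \frac{1}{p(n)}. \]

Since $C$ is arbitrary, we conclude the $\mathcal{C}$-pseudorandomness of $B$, as requested. □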



8 Pseudorandom Generators 

Rather than determining the pseudorandomness of strings, we intend to produce pseudorandom strings. A
function that generates such strings, known as a pseudorandom generator, is an important cryptographic
primitive, and a large volume of work has been dedicated to its theoretical and practical applications. In
accordance with this paper's main theme of formal language theory, we define our pseudorandom generator so
that it fools "languages" rather than "probabilistic algorithms" as in its conventional definition in, e.g., [13].
A similar treatment also appears in the design of generators that fool "Boolean circuits." For ease of notation,
we always denote the binary alphabet $\{0, 1\}$ by $\Sigma$. Let us recall the notation $\chi_A$, which expresses the
characteristic function of $A$. In cryptography, we often restrict our interest to a function $G$ that maps $\Sigma^*$ to
$\Sigma^*$ with a stretch factor* $s(n)$; namely, $|G(x)| = s(|x|)$ holds for all strings $x \in \Sigma^*$. Such a function $G$ is said
to fool a language $A$ over $\Sigma$ if the function $\ell(n) = \bigl| \mathrm{Prob}_x[\chi_A(G(x)) = 1] - \mathrm{Prob}_y[\chi_A(y) = 1] \bigr|$ is negligible,
where $x$ and $y$ are random variables over $\Sigma^n$ and $\Sigma^{s(n)}$, respectively. We often call an input $x$ fed to $G$ a
seed. A function $G$ is called a pseudorandom generator against a language family $\mathcal{C}$ if $G$ fools every language
$A$ over $\Sigma$ in $\mathcal{C}$. Taking the significance of p-denseness into consideration, we also introduce a weaker form
of pseudorandom generator, which fools only p-dense languages. Formally, a weak pseudorandom generator
against $\mathcal{C}$ is a function that fools every p-dense language over $\Sigma$ in $\mathcal{C}$. Obviously, every pseudorandom
generator is a weak pseudorandom generator. As shown later, $\mathcal{C}$-pseudorandomness has a close connection
to pseudorandom generators against $\mathcal{C}$.

In particular, this paper draws our attention to "almost one-to-one" pseudorandom generators. A generator
$G$ with the stretch factor $n + 1$ is called almost 1-1 if there is a negligible function $\tau(n) \ge 0$ such that
$|\{G(x) \mid x \in \Sigma^n\}| = |\Sigma^n|(1 - \tau(n))$ for all numbers $n \in \mathbb{N}$.
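To make the notion of "fooling" concrete, the following Python sketch computes the quantity $|\mathrm{Prob}_x[\chi_A(G(x)) = 1] - \mathrm{Prob}_y[\chi_A(y) = 1]|$ exactly for short seeds. It is an illustration under stated assumptions: the toy generator `parity_pad` and the test language `even_ones` are hypothetical examples chosen here, not objects from the paper, and the stretch factor is fixed to $n + 1$.

```python
from fractions import Fraction
from itertools import product

def strings(n):
    """Yield all binary strings of length n."""
    return ("".join(bits) for bits in product("01", repeat=n))

def fooling_gap(G, A, n):
    """Return |Prob_x[A(G(x))] - Prob_y[A(y)]| for seeds of length n.

    G is assumed to stretch n-bit seeds to (n+1)-bit strings, and A is a
    membership predicate for the language to be fooled.
    """
    p_seed = Fraction(sum(1 for x in strings(n) if A(G(x))), 2 ** n)
    p_unif = Fraction(sum(1 for y in strings(n + 1) if A(y)), 2 ** (n + 1))
    return abs(p_seed - p_unif)

def parity_pad(x):
    """Toy one-to-one generator: append the parity bit of the seed."""
    return x + str(x.count("1") % 2)

def even_ones(y):
    """Regular test language: strings with an even number of 1s."""
    return y.count("1") % 2 == 0
```

Here `fooling_gap(parity_pad, even_ones, 5)` equals 1/2, because every output of `parity_pad` has an even number of 1s while only half of all 6-bit strings do. So this toy generator, although one-to-one, does not fool even a regular language.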

Recall from Section 2 the single-valued total function class CFLSVt, which includes 1-FLIN as a proper
subclass (because REG = CFL if 1-FLIN = CFLSVt). In the rest of this section, we aim at proving that
CFLSVt contains an almost 1-1 pseudorandom generator against REG/n.

Proposition 8.1 There exists an almost 1-1 pseudorandom generator in CFLSVt against REG/n. 

To prove this proposition, let us discuss a close relation between two notions: C-pseudorandomness and 
pseudorandom generators against C. Our key lemma below states that any almost 1-1 (weak) pseudorandom 
generator against C can be characterized by the notion of (weak) C-pseudorandomness. 

Lemma 8.2 Let $\Sigma = \{0, 1\}$. Let $\mathcal{C}$ be any language family that is closed under complementation. Let $G$ be
any almost 1-1 function from $\Sigma^*$ to $\Sigma^*$ with the stretch factor $n + 1$.

1. $G$ is a pseudorandom generator against $\mathcal{C}$ iff the range $S = \{G(x) \mid x \in \Sigma^*\}$ of $G$ is a $\mathcal{C}$-pseudorandom
set.

2. $G$ is a weak pseudorandom generator against $\mathcal{C}$ iff the range $S = \{G(x) \mid x \in \Sigma^*\}$ of $G$ is a weak
$\mathcal{C}$-pseudorandom set.

*This factor is also called an expansion factor in, e.g., [13].

Proof. Let $\mathcal{C}$ be any language family such that $\mathcal{C} = \mathrm{co}\text{-}\mathcal{C}$. Assume that $G$ is an almost 1-1 function
stretching n-bit seeds to (n + 1)-bit strings. Consider $G$'s range $S = \{G(x) \mid x \in \Sigma^*\}$. For any language
$B$ over $\Sigma$ and for each length $n \in \mathbb{N}$, $B_{n+1}$ denotes $B \cap \Sigma^{n+1}$ and $\overline{B}_{n+1}$ denotes $\overline{B} \cap \Sigma^{n+1}$. In
particular, $S_{n+1}$ equals $\{G(x) \mid x \in \Sigma^n\}$. Since $G$ is almost 1-1, it holds that $|S_{n+1}| = |\Sigma^n|(1 - \tau(n))$
for a certain negligible function $\tau(n) \ge 0$. In other words, $|\Sigma^n| - |S_{n+1}| = |\Sigma^n|\tau(n)$. We write $\ell_B(n)$ for
$\bigl| \mathrm{Prob}_{x \in \Sigma^n}[\chi_B(G(x)) = 1] - \mathrm{Prob}_{y \in \Sigma^{n+1}}[\chi_B(y) = 1] \bigr|$. Henceforth, we want to show only (1) since (2) can be
proven similarly.

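(Only If - part) Assume that $G$ is a pseudorandom generator against $\mathcal{C}$. Let $B$ be any language in
$\mathcal{C}$. Since $G$ fools $B$, the function $\ell_B(n)$ is negligible. Let $p$ be any non-zero polynomial. Assume that $n$
is sufficiently large so that $\ell_B(n) \le 1/2p(n)$ and $\tau(n) \le 1/2p(n)$. It thus follows that $|\Sigma^n| - |S_{n+1}| =
|\Sigma^n|\tau(n) \le |\Sigma^n|/2p(n)$. We set $\delta_n$ and $\epsilon_n$ to satisfy that $\sum_{y \in S_{n+1} \cap B_{n+1}} |G^{-1}(y)| = \delta_n |S_{n+1} \cap B_{n+1}|$ and
$\sum_{y \in S_{n+1} \cap \overline{B}_{n+1}} |G^{-1}(y)| = \epsilon_n |S_{n+1} \cap \overline{B}_{n+1}|$. Obviously, $\delta_n, \epsilon_n \ge 1$. Since $|\Sigma^n| = \sum_{y \in S_{n+1}} |G^{-1}(y)|$ and

\[ \sum_{y \in S_{n+1}} |G^{-1}(y)| = \sum_{y \in S_{n+1} \cap B_{n+1}} |G^{-1}(y)| + \sum_{y \in S_{n+1} \cap \overline{B}_{n+1}} |G^{-1}(y)|, \]

we have

\[ |\Sigma^n| = \delta_n |S_{n+1} \cap B_{n+1}| + \epsilon_n |S_{n+1} \cap \overline{B}_{n+1}|. \]

Using this relation, since $\epsilon_n \ge 1$, we have

\[ (\delta_n - 1)|S_{n+1} \cap B_{n+1}| \le (\delta_n - 1)|S_{n+1} \cap B_{n+1}| + (\epsilon_n - 1)|S_{n+1} \cap \overline{B}_{n+1}| \]
\[ = \bigl(\delta_n |S_{n+1} \cap B_{n+1}| + \epsilon_n |S_{n+1} \cap \overline{B}_{n+1}|\bigr) - \bigl(|S_{n+1} \cap B_{n+1}| + |S_{n+1} \cap \overline{B}_{n+1}|\bigr) \]
\[ = |\Sigma^n| - |S_{n+1}| \le \frac{|\Sigma^n|}{2p(n)}. \]

Therefore, it holds that $(\delta_n - 1)|S_{n+1} \cap B_{n+1}| \le |\Sigma^n|/2p(n)$. We will use this inequality later.

Next, we want to estimate $\ell_B(n)$. With $\delta_n$, we obtain

\[ \mathrm{Prob}_{x \in \Sigma^n}[\chi_B(G(x)) = 1] = \frac{\delta_n |S_{n+1} \cap B_{n+1}|}{|\Sigma^n|}. \]

Since $\mathrm{Prob}_{y \in \Sigma^{n+1}}[\chi_B(y) = 1] = |B_{n+1}|/|\Sigma^{n+1}|$, $\ell_B(n)$ equals

\[ \left| \frac{\delta_n |S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} - \frac{|B_{n+1}|}{|\Sigma^{n+1}|} \right|, \]

which is lower-bounded by

\[ \left| \frac{|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} - \frac{|B_{n+1}|}{|\Sigma^{n+1}|} \right| - \frac{(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} = \frac{\bigl| |S_{n+1} \cap B_{n+1}| - |\overline{S}_{n+1} \cap B_{n+1}| \bigr|}{|\Sigma^{n+1}|} - \frac{2(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^{n+1}|}. \]

From our assumption $\ell_B(n) \le 1/2p(n)$, we can conclude that

\[ \frac{\bigl| |S_{n+1} \cap B_{n+1}| - |\overline{S}_{n+1} \cap B_{n+1}| \bigr|}{|\Sigma^{n+1}|} \le \frac{1}{2p(n)} + \frac{2(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^{n+1}|} \le \frac{1}{p(n)}, \]

where the last inequality follows from the previous bound $(\delta_n - 1)|S_{n+1} \cap B_{n+1}| \le |\Sigma^n|/2p(n)$.
Apply a "$\mathcal{C}$-pseudorandom" version of Lemma 7.1, and we obtain the $\mathcal{C}$-pseudorandomness of $S$.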
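(If - part) Assume that the set $S = \{G(x) \mid x \in \Sigma^*\}$ is $\mathcal{C}$-pseudorandom. To show that $G$ is a
pseudorandom generator against $\mathcal{C}$, we want to show that the function $\ell_B(n)$ is negligible for any language
$B$ in $\mathcal{C}$. Let $p$ be any non-zero polynomial and let $B$ be any language in $\mathcal{C}$. Since $S$ is $\mathcal{C}$-pseudorandom, by
a "$\mathcal{C}$-pseudorandom" version of Lemma 7.1, the following term

\[ \ell''(n) = \frac{\bigl| |S_{n+1} \cap B_{n+1}| - |\overline{S}_{n+1} \cap B_{n+1}| \bigr|}{|\Sigma^{n+1}|} = \left| \frac{|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} - \frac{|B_{n+1}|}{|\Sigma^{n+1}|} \right| \]

is upper-bounded by $1/2p(n)$ for all but finitely-many numbers $n$.

We choose two numbers $\delta_n$ and $\epsilon_n$ so that $\mathrm{Prob}_{x \in \Sigma^n}[\chi_B(G(x)) = 1] = \delta_n |S_{n+1} \cap B_{n+1}|/|\Sigma^n|$ and
$\mathrm{Prob}_{x \in \Sigma^n}[\chi_{\overline{B}}(G(x)) = 1] = \epsilon_n |S_{n+1} \cap \overline{B}_{n+1}|/|\Sigma^n|$. Since $\delta_n \ge 1$, it follows that

\[ \ell_B(n) = \left| \frac{\delta_n |S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} - \frac{|B_{n+1}|}{|\Sigma^{n+1}|} \right| \le \frac{(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} + \left| \frac{|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} - \frac{|B_{n+1}|}{|\Sigma^{n+1}|} \right|, \]

from which we bound $\ell_B(n)$ as

\[ \ell_B(n) \le \frac{(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} + \frac{1}{2p(n)}. \]

Since $\epsilon_n \ge 1$, we further estimate $\ell_B(n)$ as

\[ \ell_B(n) \le \frac{(\delta_n - 1)|S_{n+1} \cap B_{n+1}|}{|\Sigma^n|} + \frac{(\epsilon_n - 1)|S_{n+1} \cap \overline{B}_{n+1}|}{|\Sigma^n|} + \frac{1}{2p(n)} = \frac{|\Sigma^n| - |S_{n+1}|}{|\Sigma^n|} + \frac{1}{2p(n)} \le \frac{1}{2p(n)} + \frac{1}{2p(n)} = \frac{1}{p(n)}, \]

where the last inequality holds for any sufficiently large $n$ because $\tau(n) = (|\Sigma^n| - |S_{n+1}|)/|\Sigma^n|$ is negligible.

Therefore, we obtain $\ell_B(n) \le 1/p(n)$. From the arbitrariness of $B$ in $\mathcal{C}$, we can conclude that $G$ is a
pseudorandom generator against $\mathcal{C}$. □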

Let us describe the proof of Proposition 8.1. First, recall the context-free language $IP_*$ given in Section
7. We want to build our desired pseudorandom generator based on the REG/n-pseudorandomness of $IP_*$.



Proof of Proposition 8.1. The desired generator $G$ is defined as follows. Let $n$ be an arbitrary number
$\ge 3$ and let $w = axy$ be any input of length $n$ satisfying $a \in \{\lambda, 0, 1\}$ and $|x| = |y| + 1$. We consider the
first case where $n$ is odd (i.e., $a = \lambda$), assuming further that $x = bz$ for a certain bit $b$. Since $n$ is odd, let
$k = (n - 1)/2$. As described below, our generator $G$ outputs a string of the form $x'y'e$ of length $n + 1$, where
$|x'| = |x|$, $|y'| = |y|$, and $e \in \{0, 1\}$.

(1) If $w = bzy$ for a certain bit $b$ and $z^R \odot y \equiv 1 \pmod 2$, then let $G(w) = bzy\overline{b}$, where $\overline{b}$ denotes the complement of the bit $b$.

(2) If $w = 1zy$ and $z^R \odot y \equiv 0 \pmod 2$, then let $G(w) = 1zy1$.

(3) If $w = 0zy$ and $z^R \odot y \equiv 0 \pmod 2$, then check if there is the minimal index $i$ such that $z_{k-i+1} = 1$.

(3a) Consider the case where such $i$ exists. In this case, let $G(w) = 0z\tilde{y}0$, where $\tilde{y}$ is obtained from $y$
by flipping the $i$th bit; that is, $\tilde{y} = y_1y_2 \cdots y_{i-1}\overline{y_i}y_{i+1} \cdots y_k$.

(3b) Consider the other case where $i$ does not exist; in other words, $z = 0^k$. In this case, we define
$G(w) = 1zy1$.

In the remaining case where $n$ is even (i.e., $a \in \{0, 1\}$), we simply define $G(w) = au$, where $u = G(xy)$.
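The case analysis above translates directly into code. The following Python sketch is a plain deterministic rendering of the definition for experimentation, under stated assumptions: it is not the npda with an output tape, inputs are assumed to be binary strings of length at least 3, and the helper names (`rev_inner`, `in_ip_star`, `strings`, `image`) are hypothetical.

```python
from itertools import product

def rev_inner(u, v):
    """Return u^R (.) v modulo 2 for equal-length binary strings u and v."""
    return sum(int(a) & int(b) for a, b in zip(reversed(u), v)) % 2

def in_ip_star(s):
    """Membership in IP_*; an odd-length input has its first bit ignored."""
    body = s[1:] if len(s) % 2 == 1 else s
    half = len(body) // 2
    return rev_inner(body[:half], body[half:]) == 1

def G(w):
    """Generator of the proof of Proposition 8.1, stretching n bits to n + 1."""
    n = len(w)
    if n < 3:
        raise ValueError("G is defined only for inputs of length at least 3")
    if n % 2 == 0:                       # even n: keep the first bit a, recurse
        return w[0] + G(w[1:])
    k = (n - 1) // 2
    b, z, y = w[0], w[1:k + 1], w[k + 1:]
    if rev_inner(z, y) == 1:             # Case (1): append the complement of b
        return b + z + y + ("1" if b == "0" else "0")
    if b == "1":                         # Case (2)
        return "1" + z + y + "1"
    if "1" not in z:                     # Case (3b): z = 0^k
        return "1" + z + y + "1"
    # Case (3a): minimal i (1-indexed) with z_{k-i+1} = 1; flip the i-th bit of y.
    i = next(j for j in range(1, k + 1) if z[k - j] == "1")
    flipped = "1" if y[i - 1] == "0" else "0"
    return "0" + z + y[:i - 1] + flipped + y[i:] + "0"

def strings(n):
    """Yield all binary strings of length n."""
    return ("".join(bits) for bits in product("01", repeat=n))

def image(n):
    """Return the set {G(w) : |w| = n}."""
    return {G(w) for w in strings(n)}
```

For small lengths, `image(n)` can be compared with the set of strings of length $n + 1$ accepted by `in_ip_star`, and for odd $n$ its size can be compared with $2^n - 2^{(n-1)/2}$; both comparisons mirror the two claims proved next. For example, `G("000")` and `G("100")` both yield `"1001"`, which is the only kind of collision the definition allows.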

Our next goal is set to show that G is a pseudorandom generator in CFLSVt against REG/n. We begin 
with the claim that G is an almost 1-1 function. 

Claim 8 G is almost 1-1. 

Proof. Consider the case where $n$ is odd, and set $k = (n - 1)/2$ as before. In the above definition of $G$,
it is not difficult to check that all the cases except Case (3b) make $G$ one-to-one. It is thus sufficient to
deal with Case (3b). In this case, for each fixed string $y \in \Sigma^k$, only inputs taken from the set $\{00^ky, 10^ky\}$
are mapped by $G$ into the string $10^ky1$. Now, we define $\tau(n) = 1/2^{k+1} = 1/2^{(n+1)/2}$. Letting $A_k$ denote
$\bigcup_{y \in \Sigma^k} \{00^ky, 10^ky\}$, we note that $G$ is one-to-one on the domain $\Sigma^n - A_k$ and 2-to-1 on the domain $A_k$.
Hence, since $|A_k| = 2^{k+1}$, it thus follows that

\[ |\{G(w) \mid w \in \Sigma^n\}| = |\Sigma^n - A_k| + \frac{|A_k|}{2} = 2^n - 2^k = |\Sigma^n| \left(1 - \frac{1}{2^{k+1}}\right). \]

The other case where $n$ is even follows from the previous case and we can define $\tau$ accordingly. Clearly, $\tau$ is
negligible, and therefore $G$ is almost 1-1. □



Claim 9 The range $S = \{G(w) \mid w \in \Sigma^*\}$ of $G$ coincides with $IP_*$.

Proof. The containment $S \subseteq IP_*$ can be shown as follows. Let $w \in \Sigma^n$ be any input string and we
want to show that $G(w) \in IP_*$. Now, assume that $n$ is odd. Let us consider Case (1) with $w = bzy$ and
$z^R \odot y \equiv 1 \pmod 2$. In this case, $G(w) = bzy\overline{b}$. Since $(bz)^R \odot (y\overline{b}) = z^R \odot y + b \cdot \overline{b} \equiv 1 \pmod 2$,
it follows that $G(w) \in IP_*$. Next, we consider Case (3a) with $w = 0zy$ and $z^R \odot y \equiv 0 \pmod 2$. Let
$j = \min\{i \mid z_{k-i+1} = 1\}$. Notice that $z_{k-j+1} \cdot \overline{y_j} \ne z_{k-j+1} \cdot y_j$. We have

\[ z^R \odot \tilde{y} = \sum_{i \ne j} z_{k-i+1}\, y_i + z_{k-j+1} \cdot \overline{y_j} \not\equiv \sum_{i \ne j} z_{k-i+1}\, y_i + z_{k-j+1} \cdot y_j = z^R \odot y \pmod 2. \]

Thus, we have $z^R \odot \tilde{y} \equiv 1 \pmod 2$, which implies that $G(w) \in IP_*$. The other cases are similarly shown.

We then show the other containment $IP_* \subseteq S$. Choose an arbitrary string $u \in IP_* \cap \Sigma^n$ and assume
that $n$ is even. Let $k = (n - 2)/2$. Consider the case where $u = bzy\overline{b}$ with $b \in \{0, 1\}$ and $|z| = |y| = k$. Since
$u \in IP_*$, we have $(bz)^R \odot (y\overline{b}) = z^R \odot y \equiv 1 \pmod 2$. Hence, $G$ maps $w = bzy$ to $u$. This means that $u$ is
in $S$. Next, we consider the case where $u = 0zy0$ with $|z| = |y|$. Let $j = \min\{i \mid z_{k-i+1} = 1\}$. As before,
we define $\tilde{y}$ from $y$ by flipping the $j$th bit of $y$. Hence, $G(0z\tilde{y})$ equals $0zy0$, which obviously equals $u$. Thus,
$u \in S$. The other cases are similarly proven. □

Since $IP_*$ is REG/n-pseudorandom, by Claim 9, $S$ is also REG/n-pseudorandom. From $G$'s almost
one-oneness, Lemma 8.2 guarantees that $G$ is a pseudorandom generator against REG/n. What remains
unproven is that $G$ actually belongs to CFLSVt.

Claim 10 $G$ is in CFLSVt.

Proof. Here, we give an npda with a write-only output tape, which computes $G$. Our npda $N$ works
as follows. On input $w = axy$, guess nondeterministically whether $a = \lambda$ or not. Along a nondeterministic
branch associated with a guess "$a = \lambda$," check nondeterministically whether $|x| = |y| + 1$ using a stack as
storage space. During this checking process, $N$ also computes $z^R \odot y$ and finds the minimal index $i_0$ such
that $z_{k-i_0+1} = 1$ (if any). While reading input bits, for each nondeterministic computation, $N$ produces
three types of additional computation paths. Along the first one of such paths, $N$ writes $10^ky1$ on its output
tape; on the second path, $N$ writes $bzy$ on the output tape; on the third path, $N$ writes $0z\tilde{y}0$, provided that
$i_0$ exists. At the end of scanning the input, if Case (3b) does not hold, $N$ enters a rejecting state on the
first path to invalidate its output $10^ky1$. If Case (3a) does not hold, $N$ also invalidates its output $0z\tilde{y}0$ on
the third path. In Cases (1)-(2), assume that $N$ has written $bzy$ on the second path. Now, $N$ writes down
$\overline{b}$ or 1, respectively, on the output tape following $bzy$ if Case (1) or Case (2) holds. It is not difficult to show
that, for each input string $w$, $N$'s valid output is unique and matches $G(w)$. This npda $N$ therefore places
$G$ into CFLSVt. □

We have thus completed the proof of Proposition 8.1. □

To close this section, we demonstrate another application of Lemma 8.2 concerning the non-existence of
a certain weak pseudorandom generator.

Proposition 8.3 There is no almost 1-1 weak pseudorandom generator in 1-FLIN with the stretch factor 
n + 1 against REG. 

Our proof of this proposition demands new terminology. For any two multi-valued partial functions $f$
and $g$ mapping $\Sigma^*$ to $\Gamma^*$, where $\Gamma$ could be another alphabet, $f$ is called a refinement of $g$ if, for any string
$x \in \Sigma^*$, (i) $f(x) \subseteq g(x)$ (set inclusion) and (ii) $f(x) = \emptyset$ implies $g(x) = \emptyset$. Concerning 1-NLINMV, Tadaki
et al. [26] proved that every length-preserving function in 1-NLINMV has a refinement in 1-FLIN(partial).
Now, we give the proof of Proposition 8.3.

Proof of Proposition 8.3. Let $G$ be any almost 1-1 weak pseudorandom generator against REG stretching
n-bit seeds to (n + 1)-bit long strings. Toward a contradiction, we assume that $G$ belongs to 1-FLIN. By
Lemma 8.2, the range $S = \{G(x) \mid x \in \Sigma^*\}$ is weak REG-pseudorandom. If $S$ is regular, then REG
is weak REG-pseudorandom; however, this contradicts the self-exclusion property: REG cannot be weak
REG-pseudorandom. To obtain this contradiction, it remains to prove that $S$ is a regular language.

To make $G$ length-preserving, we slightly expand $G$ and define $\hat{G}(xb) = G(x)$ for each string $x$ and each
bit $b$. This new function $\hat{G}$ is also in 1-FLIN. Let us consider its inverse function $\hat{G}^{-1}(y) = \{x \mid \hat{G}(x) = y\}$.
Obviously, the inverse function $\hat{G}^{-1}$ belongs to 1-NLINMV (by guessing $x$ and then checking if $\hat{G}(x) = y$).
Since every length-preserving function in 1-NLINMV has a refinement in 1-FLIN(partial) [26], there exists a
refinement $g \in$ 1-FLIN(partial) of $\hat{G}^{-1}$, and we denote by $N$ a one-tape one-head linear-time deterministic
Turing machine that computes $g$.

Claim 11 For every string $y$, $y \in S$ iff $N$ on the input $y$ terminates with an accepting state.

As a consequence of Claim 11, $S$ is in 1-DTIME(O(n)), which equals REG [H]. We thus obtain the
regularity of $S$, as we have planned.

Finally, we want to prove Claim 11. Assume that $y$ is in $S$, meaning that $\hat{G}^{-1}(y) \ne \emptyset$. Since $g$ is a
refinement of $\hat{G}^{-1}$, we have $g(y) \ne \emptyset$, which indicates that $N$ terminates with an accepting state. Conversely,
assume that $N$ on $y$ terminates with an accepting state. In other words, $g(y) \ne \emptyset$. Since $g(y) \subseteq \hat{G}^{-1}(y)$,
we obtain $\hat{G}^{-1}(y) \ne \emptyset$. This implies that $y = G(x)$ for a certain string $x$. Since $S = \{G(x) \mid x \in \Sigma^*\}$, it
immediately follows that $y \in S$. Therefore, Claim 11 holds. □



9 Discussion and Open Problems 

We have discussed two fundamental notions — immunity and pseudorandomness — in a framework of formal 
language theory. Our main target of this paper is the context-free language. Our initial study in this paper 
has revealed a quite rich structure that lies inside CFL. For instance, CFL contains complex languages, 
which are REG-immune, CFL-simple, and REG/n-pseudorandom. Moreover, its function class CFLSVt 
contains a pseudorandom generator against REG/n. 

There remain several key questions that we have not answered throughout this paper. To direct future 
research, we generate a short list of those questions for the interested reader. 

1. As shown in Section 5, L ∩ REG/n is REG-bi-immune. Determine whether CFL is also REG-bi-immune.
More strongly, is CFL − REG/n REG-bi-immune?

2. Prove or disprove that CFL(2) − CFL/n is CFL-immune.

3. Is there any context-free language that is p-dense REG-immune? Is one of such languages located 
outside of REG/n? 

4. The languages $L_{keq}$, where $k \ge 3$, are shown to be CFL-simple; however, they cannot be REG-immune.
Is there any REG-immune CFL-simple language?

5. We can define the notion of "CFL-primesimplicity" analogous to "CFL-simplicity." Find natural CFL- 
primesimple languages. 

6. Is DCFL REG/n-pseudorandom? An affirmative answer implies the REG/n-bi-primeimmunity of 
DCFL. 

7. As noted in Section 3, the language $L_{3eq}$ belongs to CFL(2) and it is also CFL(1)-immune. In short,
CFL(2) is CFL(1)-immune. Naturally, we can ask if, for each index $k \ge 2$, CFL($k + 1$) is
CFL($k$)-immune.

8. Our pseudorandom generator $G$ given in Section 8 is almost 1-1. Find a natural 1-1 pseudorandom
generator against REG/n.

9. Find a natural and easy-to-compute pseudorandom generator against CFL/n. 

The answers to the above questions will surely enrich our knowledge on context-free languages. 